The Price equation partitions total evolutionary change into two components. The first
component provides an abstract expression of natural selection. The second component subsumes all other
evolutionary processes, including changes during transmission. The natural selection component 
is often used in applications. Those applications attract widespread interest for their simplicity 
of expression and ease of interpretation. Those same applications attract widespread criticism by 
dropping the second component of evolutionary change and by leaving unspecified the detailed as- 
sumptions needed for a complete study of dynamics. Controversies over approximation and dynamics 
have nothing to do with the Price equation itself, which is simply a mathematical equivalence rela- 
tion for total evolutionary change expressed in an alternative form. Disagreements about approach 
have to do with the tension between the relative valuation of abstract versus concrete analyses. The 
Price equation's greatest value has been on the abstract side, particularly the invariance relations 
that illuminate the understanding of natural selection. Those abstract insights lay the foundation 
for applications in terms of kin selection, information theory interpretations of natural selection, 
and partitions of causes by path analysis. I discuss recent critiques of the Price equation by Nowak 
and van Veelen.



The heart and soul of much mathematics 
consists of the fact that the "same" object 
can be presented to us in different ways. 
Even if we are faced with the simple-seeming 
task of "giving" a large number, there is no 
way of doing this without also, at the same 
time, "giving" a hefty amount of extra struc- 
ture that comes as a result of the way we 
pin down — or the way we present — our large 
number. If we write our number as 1729 we 
are, sotto voce, ordering a preferred way of 
"computing it" (add one thousand to seven 
hundreds to two tens to nine). If we present it 
as $1 + 12^3$ we are recommending another mode
of computation, and if we pin it down, as Ramanujan
did, as the first number expressible
as a sum of two cubes in two different
ways, we are being less specific about how to 
compute our number, but have underscored a 
characterizing property of it within a subtle 
diophantine arena. . . . 

This issue has been with us, of course, for- 
ever: the general question of abstraction, as 
separating what we want from what we are 
presented with. It is neatly packaged in the 
Greek verb aphairein, as interpreted by Aris- 
totle in the later books of the Metaphysics to 
mean simply separation: if it is whiteness we 
want to think about, we must somehow sep- 
arate it from white horse, white house, white 
hose, and all the other white things that it 
invariably must come along with, in order for
us to experience it at all [1, pp. 222-223].

Somewhere . . . between the specific that has 
no meaning and the general that has no con- 
tent there must be, for each purpose and at 
each level of abstraction, an optimum degree 
of generality [2, pp. 197-198].



INTRODUCTION 

Evolutionary theory analyzes the change in pheno- 
type over time. We may interpret phenotype broadly 
to include organismal characters, variances of characters, 
correlations between characters, gene frequency, DNA 
sequence — essentially anything we can measure. 

How does a phenotype influence its own change in frequency
or the change in the frequencies of correlated phenotypes?
Can we separate that phenotypic influence from
other evolutionary forces that also cause change? The as- 
sociation of a phenotype with change in frequency, sep- 
arated from other forces that change phenotype, is one 
abstract way to describe natural selection. The Price 
equation is that kind of abstract separation. 

Do we really need such abstraction, which may seem 
rather distant and vague? Instead of wasting time on 
such things as the abstract essence of natural selection, 
why not get down to business and analyze real problems? 
For example, we may wish to know how the evolution- 
ary forces of mutation and selection interact to determine 
biological pattern. We could make a model with genes 
that have phenotypic effects, selection that acts on those 
phenotypes to change gene frequency, and mutation that 
changes one gene into another. We could do some calcu- 
lations, make some predictions about, for example, the 
frequency of deleterious mutations that cause disease, 
Box 1. Topics in the theory of natural selection



This article is part of a series on natural selection. Although 
the theory of natural selection is simple, it remains endlessly 
contentious and difficult to apply. My goal is to make more ac- 
cessible the concepts that are so important, yet either mostly 
unknown or widely misunderstood. I write in a nontechni- 
cal style, showing the key equations and results rather than 
providing full derivations or discussions of mathematical prob- 
lems. Boxes list technical issues and brief summaries of the 
literature. 



and compare those predictions to observations. All clear 
and concrete, without need of any discussion of the 
essence of things. 

However, we may ask the following. Is there some re- 
orientation for the expression of natural selection that 
may provide subtle perspective, from which we can un- 
derstand our subject more deeply and analyze our prob- 
lems with greater ease and greater insight? My answer 
is, as I have mentioned, that the Price equation provides 
that sort of reorientation. To argue the point, I will have 
to keep at the distinction between the concrete and the 
abstract, and the relative roles of those two endpoints in 
mature theoretical understanding. 

Several decades have passed since Price's [3, 4] original
articles. During that span, published claims, counter- 
claims and misunderstandings have accumulated to the 
point that it seems worthwhile to revisit the subject. On 
the one hand, the Price equation has been applied to nu- 
merous practical problems, and has also been elevated by 
some to almost mythical status, as if it were the ultimate 
path to enlightenment for those devoted to evolutionary 
study (Box 2).

On the other hand, the opposition has been gaining 
adherents who boast the sort of disparaging anecdotes 
and slogans that accompany battle. In a recent book, 
Nowak and Highfield [5] counter:

The Price equation did not, however, prove 
as useful as [Price and Hamilton] had hoped. 
It turned out to be the mathematical equiv- 
alent of a tautology. ... If the Price equation 
is used instead of an actual model, then the 
arguments hang in the air like a tantalizing 
mirage. The meaning will always lie just out 
of the reach of the inquisitive biologist. This 
mirage can be seductive and misleading. The 
Price equation can fool people into believing 
that they have built a mathematical model 
of whatever system they are studying. But 
this is often not the case. Although answers 
do indeed seem to pop out of the equation, 
like rabbits from a magician's hat, nothing is 
achieved in reality. 

Nowak and Highfield [5] approvingly quote van Veelen
et al. [6] with regard to calling the Price equation
a mathematical tautology. Van Veelen et al. [6] emphasize
the point by saying that the Price equation is like
soccer/football star Johan Cruyff's quip about the se- 
cret of success: "You always have to make sure that you 
score one goal more than your opponent." The state- 
ment is always true, but provides no insight. Nowak and 
Highfield [5] and van Veelen et al. [6] believe their arguments
demonstrate that the Price equation is true in the
same trivial sense, and they call that trivial type of truth 
a mathematical tautology. Interestingly, magazines, on- 
line articles, and the scientific literature have for several 
years been using the phrase mathematical tautology for 
the Price equation, although Nowak and Highfield [5] and 
van Veelen et al. [6] do not provide citations to previous
literature. 

As far as I know, the first description of the Price equation
as a mathematical tautology was in Frank [7]. I used
the phrase in the sense of the epigraph from Mazur, a for- 
mal equivalence between different expressions of the same 
object. Mathematics and much of statistics are about 
formal equivalences between different expressions of the 
same object. For example, the Laplace transform changes 
a mathematical expression into an alternative form with 
the same information, and analysis of variance decom- 
poses the total variance into a sum of component vari- 
ances. For any mathematical or statistical equivalence, 
value depends on enhanced analytical power that eases 
further derivations and calculations, and on the ways in 
which previously hidden relations are revealed. 

In light of the contradictory points of view, the main 
goal of this article is to sort out exactly what the Price 
equation is, how we should think about it, and its value 
and limitations in reasoning about evolution. Subsequent 
articles will show the Price equation in action, applied 
to kin selection, causal analysis in evolutionary models, 
and an information perspective of natural selection and 
Fisher's fundamental theorem. 



OVERVIEW 

The first section derives the Price equation in its full 
and most abstract form. That derivation allows us to 
evaluate the logical status of the equation in relation to 
various claims of fundamental flaw. The equation sur- 
vives scrutiny. It is a mathematical relation that ex- 
presses the total amount of evolutionary change in an 
alternative and mathematically equivalent way. That 
equivalence provides insight into aspects of natural se- 
lection and also provides a guide that, in particular ap- 
plications, often leads to good approaches for analysis. 

The second section contrasts two perspectives of evo- 
lutionary analysis. In standard models of evolutionary 
change, one begins with the initial population state and 
the rules of change. The rules of change include the fit- 
ness of each phenotype and the change in phenotype be- 
tween ancestor and descendant. Given the initial state 
and rules of change, one deduces the state of the changed 
population. Alternatively, one may have data on the
initial population state, the changed population state, 
and the ancestor-descendant relations that map entities 
from one population to the other. Those data may be 
reduced to the evolutionary distance between two pop- 
ulations, providing inductive information about the un- 
derlying rules of change. Natural populations have no in- 
trinsic notion of fitness or rules of change. Instead, they 
inductively accumulate information. The Price equation 
includes both the standard deductive model of evolution- 
ary change and the inductive model by which informa- 
tion accumulates in relation to the evolutionary distance 
between populations. 

The third and fourth sections discuss the Price equa- 
tion's abstract properties of invariance and recursion. 
The invariance properties include the information theory 
interpretation of natural selection. Recursion provides 
the basis for analyzing group selection and other models 
of multilevel selection. 

The fifth section relates the Price equation to various 
expressions that have been used throughout the history 
of evolutionary theory to analyze natural selection. The 
most common form describes natural selection by the co- 
variance between phenotype and fitness or by the covari- 
ance between genetic breeding value and fitness. The 
covariance expression is one part of the Price equation 
that, when used alone, describes the natural selection 
component of total evolutionary change. Those
covariance forms arose, in essence, in the early studies of
population and quantitative genetics, have been used extensively
during much of the modern history of animal
breeding, and began to receive more mathematical development
in the 1960s and 1970s. Recent critiques of
the Price equation focus on the same covariance expres- 
sion that has been widely used throughout the history of 
population and quantitative genetics to analyze natural 
selection and to approximate total evolutionary change. 

The sixth section returns to the full abstract form of 
the equation. I compare a few variant expressions that 
have been promoted as improvements on the original 
Price equation. Variant forms are indeed helpful with 
regard to particular abstract problems or particular ap- 
plications. However, most variants are simply minor re- 
arrangements of the mathematical equivalence for total 
evolutionary change given by the original Price equation. 
The recent extension by Kerr and Godfrey-Smith [8] does
provide a slightly more general formulation by expand- 
ing the fundamental set mapping that defines Price's ap- 
proach. The set mapping basis for the Price equation 
deserves more careful study and further mathematical 
work. 

The seventh section analyzes various flaws that have 
been ascribed to the Price equation. For example, the 
Price equation in its most abstract form does not con- 
tain enough information to follow evolutionary dynamics 
through multiple rounds of natural selection. By con- 
trast, classical dynamic models of population genetics are 
sufficient to follow change through time. Much has been 



Box 2. Price equation literature 



A large literature introduces and reviews the Price equation. 
I list some key references that can be used to get started
[9-16].

Diverse applications have been developed with the Price 
equation. I list a few examples [17-28].

Quantitative genetics theory often derives from the covariance
expression given by Robertson [29], which is a form of
the covariance term of the Price equation. The basic theory 
can be found in textbooks [30, 31]. Much of the modern work
can be traced through the widely cited article by Lande and
Arnold [32].

Harman [33] provides an interesting overview of Price's 
life and evokes an Olympian sense of the power and magic 
of the Price equation. See Schwartz [34] for an alternative 
biographical sketch. 



made of this distinction with regard to dynamic suffi- 
ciency. The distinction arises from the fact that classical 
dynamics in population genetics makes more initial as- 
sumptions than the abstract Price equation. It must be 
true that all mathematical equivalences for total evolu- 
tionary change have the same dynamic status given the 
same initial assumptions. Each additional well-chosen as- 
sumption typically enhances the specificity and reduces 
the scope and generality of the analysis. The epigraph 
from Boulding emphasizes that the degree of specificity 
versus generality is an explicit choice of the analyst with 
respect to initial assumptions. 

The Discussion considers the value and limitations of 
the Price equation in relation to recent criticisms by 
Nowak and van Veelen. The critics confuse the distinct 
roles of general abstract theory and concrete dynamical 
models for particular cases. The enduring power of the 
Price equation arises from the discovery of essential in- 
variances in natural selection. For example, kin selection 
theory expresses biological problems in terms of related- 
ness coefficients. Relatedness measures the association 
between social partners. The proper measure of related- 
ness identifies distinct biological scenarios with the same 
(invariant) evolutionary outcome. Invariance relations 
provide the deepest insights of scientific thought. 



THE PRICE EQUATION 

The mathematics given here applies not only 
to genetical selection but to selection in gen- 
eral. It is intended mainly for use in deriv- 
ing general relations and constructing theo- 
ries, and to clarify understanding of selection 
phenomena, rather than for numerical calculation
[4, p. 485].

I have emphasized that the Price equation is a mathematical
equivalence. The equation focuses on separation
of total evolutionary change into a part attributed to selection
and a remainder term. That separation provides
an abstraction of the nature of selection. As Price wrote 
sometime around 1970 but published posthumously in 
Price [35]: "Despite the pervading importance of selec- 
tion in science and life, there has been no abstraction and 
generalization from genetical selection to obtain a general 
selection theory and general selection mathematics." 

It is useful first to consider the Price equation in this 
most abstract form. I follow my earlier derivations [7, 10, 24, 36],
which differ little from the derivation given
by Price [4] when interpreted in light of Price [35].

The abstract expression can best be thought of in 
terms of mapping items between two sets [35]. In biology,
we usually think of an ancestral population at some
time and a descendant population at a later time. Al- 
though there is no need to have an ancestor-descendant 
relation, I will for convenience refer to the two sets as an- 
cestor and descendant. What does matter is the relations 
between the two sets, as follows. 



Definitions 

The full abstract power of the Price equation requires 
adhering strictly to particular definitions. The definitions 
arise from the general expression of the relations between 
two sets. 

Let $q_i$ be the frequency of the $i$th type in the ancestral
population. The index $i$ may be used as a label for any
sort of property of things in the set, such as allele, genotype,
phenotype, group of individuals, and so on. Let
$q_i'$ be the frequencies in the descendant population, defined
as the fraction of the descendant population that is
derived from members of the ancestral population that
have the label $i$. Thus, if $i = 2$ specifies a particular
phenotype, then $q_2'$ is not the frequency of the phenotype
$i = 2$ among the descendants. Rather, it is the fraction
of the descendants derived from entities with the
phenotype $i = 2$ in the ancestors. One can have partial
assignments, such that a descendant entity derives
from more than one ancestor, in which case each ances- 
tor gets a fractional assignment of the descendant. The 
key is that the i indexing is always with respect to the 
properties of the ancestors, and descendant frequencies 
have to do with the fraction of descendants derived from 
particular ancestors. 

Given this particular mapping between sets, we can 
specify a particular definition for fitness. Let $q_i' =
q_i (w_i/\bar{w})$, where $w_i$ is the fitness of the $i$th type and
$\bar{w} = \sum q_i w_i$ is average fitness. Here, $w_i/\bar{w}$ is proportional
to the fraction of the descendant population that
derives from type $i$ entities in the ancestors.

Usually, we are interested in how some measurement 
changes or evolves between sets or over time. Let the 
measurement for each $i$ be $z_i$. The value $z$ may be the
frequency of a gene, the squared deviation of some phenotypic
value in relation to the mean, the value obtained
by multiplying measurements of two different phenotypes
of the same entity, and so on. In other words, $z_i$ can be
a measurement of any property of an entity with label
$i$. The average property value is $\bar{z} = \sum q_i z_i$, where this
is a population average.

The value $z_i'$ has a peculiar definition that parallels
the definition for $q_i'$. In particular, $z_i'$ is the average measurement
of the property associated with $z$ among the
descendants derived from ancestors with index $i$. The
population average among descendants is $\bar{z}' = \sum q_i' z_i'$.
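
These set-relation definitions can be made concrete with a small numerical sketch. The toy data below (two ancestor types, four descendants, and the `assign` table of fractional assignments) are hypothetical values chosen only for illustration. The sketch computes $q_i'$, the relative fitness $w_i/\bar{w}$, and $z_i'$ from the mapping, and checks that $\sum q_i' z_i'$ equals the plain average over descendants.

```python
from fractions import Fraction as F

# Hypothetical toy data: two ancestor types with equal ancestral frequency.
q = [F(1, 2), F(1, 2)]               # ancestral frequencies q_i

# assign[j][i] = fraction of descendant j attributed to ancestor type i.
# The last descendant derives equally from both types (a partial assignment).
assign = [
    [F(1), F(0)],
    [F(0), F(1)],
    [F(0), F(1)],
    [F(1, 2), F(1, 2)],
]
descendant_z = [F(1), F(2), F(3), F(2)]  # measurement on each descendant

n_desc = len(descendant_z)
types = range(len(q))

# Total descendant share credited to each ancestor type
# (assumed nonzero here; a type with no descendants has q'_i = 0).
share = [sum(row[i] for row in assign) for i in types]

# q'_i: fraction of the descendant population derived from type i.
q_prime = [share[i] / n_desc for i in types]

# Relative fitness w_i / w_bar, from q'_i = q_i * (w_i / w_bar).
rel_w = [q_prime[i] / q[i] for i in types]

# z'_i: average measurement among descendants derived from type i,
# weighting each descendant by its fractional assignment.
z_prime = [
    sum(row[i] * zj for row, zj in zip(assign, descendant_z)) / share[i]
    for i in types
]

# Population average among descendants, computed two ways.
z_bar_prime = sum(qp * zp for qp, zp in zip(q_prime, z_prime))
direct_mean = sum(descendant_z) / n_desc

print(q_prime, rel_w, z_prime, z_bar_prime, direct_mean)
```

Note that the index $i$ refers to ancestor types throughout: `q_prime` and `z_prime` say nothing directly about the frequencies of types among the descendants themselves.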

The Price equation expresses the total change in the 
average property value, $\Delta\bar{z} = \bar{z}' - \bar{z}$, in terms of these
special definitions of set relations. This way of expressing 
total evolutionary change and the part of total change 
that can be separated out as selection is very different 
from the usual ways of thinking about populations and 
evolutionary change. The derivation itself is very easy, 
but grasping the meaning and becoming adept at using 
the equation is not so easy. 

I will present the derivation in two stages. The first 
stage makes the separation into a part ascribed to selec- 
tion and a part ascribed to property change that covers 
everything beyond selection. The second stage retains 
this separation, changing the notation into standard sta- 
tistical expressions that provide the form of the Price 
equation commonly found in the literature. I follow with 
some examples to illustrate how particular set relations 
are separated into selection and property change compo- 
nents. The next section considers two distinct interpre- 
tations of the Price equation in relation to dynamics. 



Derivation: separation into selection and property 
value change 

We use $\Delta q_i = q_i' - q_i$ for frequency change associated
with selection, and $\Delta z_i = z_i' - z_i$ for property value
change. Both expressions for change depend on the special
set relation definitions given above.

We are after an alternative expression for total change, 
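$\Delta\bar{z}$. Thus,

$$
\begin{aligned}
\Delta\bar{z} &= \bar{z}' - \bar{z} \\
&= \sum q_i' z_i' - \sum q_i z_i \\
&= \sum q_i' (z_i' - z_i) + \sum (q_i' - q_i) z_i \\
&= \sum q_i' (\Delta z_i) + \sum (\Delta q_i) z_i .
\end{aligned}
$$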

Switching the order of the terms on the right side of the 
last line yields 

$$
\Delta\bar{z} = \sum (\Delta q_i) z_i + \sum q_i' (\Delta z_i), \qquad (1)
$$

a form emphasized by Frank [10, eqn 1]. The first term
separates the part of total change caused by changes in 
frequency. We call this the part caused by selection, be- 
cause this is the part that arises directly from differential 
contribution by ancestors to the descendant population
[35]. Because the set mappings define all of the direct
attributions of success for each $i$ with respect to the associated
properties $z_i$, it is reasonable to separate out
this direct component as the abstraction of selection. It 
is of course possible to define other separations. I discuss 
one particular alternative later. However, it is hard to 
think of other separations that would describe selection 
in a better way at the most abstract and general level 
of the mappings between two sets. This first term has 
also been called the partial evolutionary change caused 
by natural selection (Eq. Q). 

The second term describes the part of total change 
caused by changes in property values. Recall that $\Delta z_i =
z_i' - z_i$, and that $z_i'$ is the property value among entities
that descend from $i$. Many different processes may
cause descendant property values to differ from ances- 
tral values. In fact, the assignment of a descendant to an 
ancestor can be entirely arbitrary, so that there is no rea- 
son to assume that descendants should be like ancestors. 
Usually, we will work with systems in which descendants 
do resemble ancestors, but the degree of such associations 
can be arranged arbitrarily. This term for change in prop- 
erty value encompasses everything beyond selection. The 
idea is that selection affects the relative contribution of 
ancestors and thus the changes in frequencies of repre- 
sentation, but what actually gets represented among the 
descendants will be subject to a variety of processes that 
may alter the value expressed by descendants. 
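
The separation in Eq. (1) can be checked numerically. The following is a minimal sketch using hypothetical values for the frequencies, fitnesses, and measurements of three types. It computes the selection term $\sum (\Delta q_i) z_i$ and the property change term $\sum q_i' (\Delta z_i)$ separately and confirms that they add up to the total change $\Delta\bar{z}$.

```python
from fractions import Fraction as F

# Hypothetical values for three types (illustration only).
q       = [F(1, 2), F(1, 4), F(1, 4)]   # ancestral frequencies q_i
w       = [F(1), F(2), F(3)]            # fitnesses w_i
z       = [F(0), F(1), F(2)]            # ancestral measurements z_i
z_prime = [F(0), F(3, 4), F(2)]         # z'_i among descendants of type i

# Descendant frequencies indexed by ancestor type: q'_i = q_i w_i / w_bar.
w_bar   = sum(qi * wi for qi, wi in zip(q, w))
q_prime = [qi * wi / w_bar for qi, wi in zip(q, w)]

# Total change in the average property value.
z_bar       = sum(qi * zi for qi, zi in zip(q, z))
z_bar_prime = sum(qp * zp for qp, zp in zip(q_prime, z_prime))
total = z_bar_prime - z_bar

# First term of Eq. (1): change caused by frequency change (selection).
selection = sum((qp - qi) * zi for qp, qi, zi in zip(q_prime, q, z))

# Second term of Eq. (1): change caused by property value change.
property_change = sum(qp * (zp - zi) for qp, zp, zi in zip(q_prime, z_prime, z))

print(total, selection, property_change)
```

With these values the two terms have opposite signs: selection raises the average while the change in property value for the second type lowers it. The identity holds for any other choice of inputs with positive average fitness.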

The equation is exact and must apply to every evolu- 
tionary system that can be expressed as two sets with 
certain ancestor-descendant or mapping relations. It is 
in that sense that I first used the phrase mathematical 
tautology [7]. The nature of separation and abstraction
is well described by the epigraph from Mazur at the start 
of this article. 



Derivation: statistical notation 

Price [4] used statistical notation to write Eq. (1). For
the first term, by following prior definitions we have 

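$$
\begin{aligned}
\Delta q_i &= q_i' - q_i \\
&= q_i \frac{w_i}{\bar{w}} - q_i \\
&= q_i \left( \frac{w_i}{\bar{w}} - 1 \right),
\end{aligned}
$$

so that

$$
\sum (\Delta q_i) z_i = \sum q_i \left( \frac{w_i}{\bar{w}} - 1 \right) z_i = \mathrm{Cov}(w, z)/\bar{w},
$$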

using the standard definition for population covariance. 
For the second term, we have 

$$
\sum q_i' (\Delta z_i) = \sum q_i \frac{w_i}{\bar{w}} (\Delta z_i) = \mathrm{E}(w \Delta z)/\bar{w},
$$

where E means expectation, or average over the full population.
Putting these statistical forms into Eq. (1) and
moving $\bar{w}$ to the left side for notational convenience yields
a commonly published form of the Price equation 

$$
\bar{w} \Delta\bar{z} = \mathrm{Cov}(w, z) + \mathrm{E}(w \Delta z). \qquad (2)
$$

Price [35] and Frank [7] present examples of set mappings
expressed in relation to the Price equation. 
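
As an illustrative check of Eq. (2), the sketch below evaluates both sides for a hypothetical three-type population. The helper functions `mean` and `cov` are names introduced here; they implement the frequency-weighted population mean and covariance, which is the sense of Cov and E assumed in the statistical notation.

```python
from fractions import Fraction as F

def mean(q, x):
    """Population average of x weighted by frequencies q."""
    return sum(qi * xi for qi, xi in zip(q, x))

def cov(q, x, y):
    """Population covariance: E(xy) - E(x)E(y), weighted by q."""
    xy = [xi * yi for xi, yi in zip(x, y)]
    return mean(q, xy) - mean(q, x) * mean(q, y)

# Hypothetical values (illustration only).
q  = [F(1, 4), F(1, 2), F(1, 4)]     # ancestral frequencies q_i
w  = [F(2), F(1), F(1)]              # fitnesses w_i
z  = [F(1), F(2), F(4)]              # ancestral measurements z_i
dz = [F(1, 2), F(1, 2), F(-1)]       # property changes, z'_i - z_i

# Left side: w_bar times the total change in the average, from the definitions.
w_bar   = mean(q, w)
q_prime = [qi * wi / w_bar for qi, wi in zip(q, w)]
z_prime = [zi + dzi for zi, dzi in zip(z, dz)]
delta_z_bar = mean(q_prime, z_prime) - mean(q, z)
lhs = w_bar * delta_z_bar

# Right side: covariance term plus expectation term.
cov_wz   = cov(q, w, z)
exp_w_dz = mean(q, [wi * dzi for wi, dzi in zip(w, dz)])

print(lhs, cov_wz, exp_w_dz)
```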



DYNAMICS: INDUCTIVE AND DEDUCTIVE 
PERSPECTIVES 

The Price equation describes evolutionary change be- 
tween two populations. Three factors express one itera- 
tion of dynamical change: initial state, rules of change, 
and next state. In the Price equation, the phenotypes, 
$z_i$, and their frequencies, $q_i$, describe the initial population
state. Fitnesses, $w_i$, and property changes, $\Delta z_i$, set
the rules of change. Derived phenotypes, $z_i'$, and their
frequencies, $q_i'$, express the next population state.

Models of evolutionary change essentially always an- 
alyze forward or deductive dynamics. In that case, one 
starts with initial conditions and rules of change and cal- 
culates the next state. Most applications of the Price 
equation use this traditional deductive analysis. Such 
applications lead to predictions of evolutionary outcome 
given assumptions about evolutionary process, expressed 
by the fitness parameters and property changes. 

Alternatively, one can take the state of the initial pop- 
ulation and the state of the changed population as given. 
If one also has the mappings between initial and changed 
populations that connect each entity, i, in the initial pop- 
ulation to entities in the changed population, then one 
can calculate (induce) the underlying rules of change. At 
first glance, this inductive view of dynamics may seem 
rather odd and not particularly useful. Why start with 
knowledge of the evolutionary sequence of population 
states and ancestor-descendant relations as given, and in- 
ductively calculate fitnesses and property changes? The 
inductive view takes the fitnesses, $w_i$, to be derived from
the data rather than an intrinsic property of each type. 

The Price equation itself does not distinguish between 
the deductive and inductive interpretations. One can 
specify initial state and rules of change and then deduce 
outcome. Or one can specify initial state and outcome 
along with ancestor-descendant mappings, and then in- 
duce the underlying rules of change. It is useful to under- 
stand the Price equation in its full mathematical gener- 
ality, and to understand that any specific interpretation 
arises from additional assumptions that one brings to a 
particular problem. Much of the abstract power of the 
Price equation comes from understanding that, by itself, 
the equation is a minimal description of change between 
populations. 
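
The two readings can be sketched as a pair of functions. The names `deduce` and `induce` are hypothetical labels introduced for this illustration, and the numerical values are arbitrary. The sketch assumes that the next state is already indexed by ancestor type, which is how the ancestor-descendant mapping enters. It also assumes every ancestral frequency is positive.

```python
from fractions import Fraction as F

def deduce(q, z, w, dz):
    """Deductive step: initial state (q, z) plus rules (w, dz) give the next state."""
    w_bar = sum(qi * wi for qi, wi in zip(q, w))
    q_next = [qi * wi / w_bar for qi, wi in zip(q, w)]
    z_next = [zi + dzi for zi, dzi in zip(z, dz)]
    return q_next, z_next

def induce(q, z, q_next, z_next):
    """Inductive step: two states, indexed by ancestor type, give the rules.

    Only relative fitness w_i / w_bar is recovered from the frequencies.
    """
    rel_w = [qn / qi for qn, qi in zip(q_next, q)]
    dz = [zn - zi for zn, zi in zip(z_next, z)]
    return rel_w, dz

# Hypothetical initial state and rules of change.
q  = [F(1, 2), F(1, 4), F(1, 4)]
z  = [F(0), F(1), F(2)]
w  = [F(1), F(2), F(3)]
dz = [F(0), F(-1, 4), F(0)]

q_next, z_next = deduce(q, z, w, dz)
rel_w, dz_back = induce(q, z, q_next, z_next)

w_bar = sum(qi * wi for qi, wi in zip(q, w))
print(rel_w, [wi / w_bar for wi in w], dz_back)
```

The round trip recovers the property changes exactly and the fitnesses up to the common factor $\bar{w}$, which is all that the frequencies alone can determine.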

The deductive interpretation of the Price equation is 
clear. What value derives from the inductive perspec- 
tive? In observational studies of evolutionary change, we 
only have data on population states. From those data, 
we use the inductive perspective to make inferences about 
the underlying rules of change. Note that inductive estimates
for evolutionary process derive from the amount
of change, or distance, between ancestor and descendant 
populations. The Price equation includes that inductive, 
or retrospective view, by expressing the distance between 
populations in terms of $\Delta\bar{z}$. I develop that distance interpretation
in the following sections.

Perhaps more importantly, natural selection itself is 
inherently an inductive process by which information ac- 
cumulates in populations. Nature does not intrinsically 
"know" of fitness parameters. Instead, frequency changes 
and the mappings between ancestor and descendant are 
inherent in a population's response to the environment, 
leading to a sequence of population states, each separated 
by an evolutionary distance. That evolutionary distance 
provides information that populations accumulate inductively
about the fitnesses of each phenotype [36]. The
Price equation includes both the deductive and inductive 
perspectives. We may choose to interpret the equation 
in either way depending on our goals of analysis. 

ABSTRACT PROPERTIES: INVARIANCE 

The Price equation describes selection by the term 
$\sum (\Delta q_i) z_i = \mathrm{Cov}(w, z)/\bar{w}$. Any instance of evolutionary
change that has the same value for this sum has the same 
amount of total selection. Put another way, for any par- 
ticular value for total selection, there is an infinite num- 
ber of different combinations of frequency changes and 
character measurements that will add up to the same 
total value for selection. All of those different combina- 
tions lead to the same value with respect to the amount 
of selection. We may say that all of those different com- 
binations are invariant with respect to the total quantity 
of selection. The deepest insights of science come from 
understanding what does not matter, so that one can also 
say exactly what does matter, namely what is invariant [37, 38].
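
A small numerical sketch may help fix the idea. The three combinations below are hypothetical; they differ in their frequency changes or in their character measurements, yet each gives the same value of $\sum (\Delta q_i) z_i$. The ancestral frequencies are assumed to be $1/3$ for each type, so that both sets of frequency changes lead to valid descendant frequencies.

```python
from fractions import Fraction as F

def selection_term(dq, z):
    """Total selection: the sum of (delta q_i) * z_i over all types."""
    return sum(d * zi for d, zi in zip(dq, z))

# Hypothetical character measurements for three types.
z = [F(0), F(1), F(2)]

# Two different patterns of frequency change; each sums to zero.
dq_a = [F(-1, 6), F(0), F(1, 6)]
dq_b = [F(0), F(-1, 3), F(1, 3)]

# A different set of measurements combined with the first pattern.
z_c = [F(0), F(2), F(2)]

s_a = selection_term(dq_a, z)
s_b = selection_term(dq_b, z)
s_c = selection_term(dq_a, z_c)

print(s_a, s_b, s_c)
```

All three cases are equivalent with respect to the total quantity of selection, although the underlying frequency changes and measurements differ.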

The invariance of selection with respect to transforma- 
tions of the fitnesses, w, and the phenotypes, z, that have 
the same Cov(w, z) means that, to evaluate selection, it 
is sufficient to analyze this covariance. At first glance, it 
may seem contradictory that the covariance, commonly 
thought of as a linear measure of association, can be 
a complete description for selection, including nonlinear 
processes. Let us step through this issue, first looking at 
why the covariance is a sufficient expression of selection, 
and then at the limitations of this covariance expression 
in evolutionary analysis. 

Covariance as a measure of distance: definitions 

Much of the confusion with respect to covariance and 
variance terms in selection equations arises from think- 
ing only of the traditional statistical usage. In statistics, 
covariance typically measures the linear association between
pairs of observations, and variance is a measure of
the squared spread of observations. Alternatively, covariances
and variances provide measures of distance, which
ultimately can be understood as measures of information
[36]. This section introduces the notation for the geometric
interpretation of distance. The next section gives the
main geometric result, and the following section presents 
some examples. 

The identity $\sum (\Delta q_i) z_i = \mathrm{Cov}(w, z)/\bar{w}$ provides the
key insight. It helps to write this identity in an alternative
form. Note from the prior definition $q_i' = q_i w_i/\bar{w}$
that

$$
\Delta q_i = q_i' - q_i = q_i (w_i/\bar{w} - 1) = q_i a_i, \qquad (3)
$$

where $a_i = w_i/\bar{w} - 1$ is Fisher's average excess in fitness,
a commonly used expression in population and quantitative
genetics [3HHH]. A value of zero means that an entity
has average fitness, and therefore fitness effects and se- 
lection do not change the frequency of that entity. Using 
the average excess in fitness, we can write the invariant 
expression for selection as 

$$
\sum (\Delta q_i) z_i = \sum q_i a_i z_i = \mathrm{Cov}(w, z)/\bar{w}. \qquad (4)
$$
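
The average excess form in Eq. (4) can be illustrated with a short sketch. The values of the frequencies, fitnesses, and measurements are hypothetical. The sketch computes $a_i = w_i/\bar{w} - 1$, confirms that the frequency-weighted average excess is zero, and confirms that $\sum q_i a_i z_i$ matches $\mathrm{Cov}(w, z)/\bar{w}$.

```python
from fractions import Fraction as F

# Hypothetical values (illustration only).
q = [F(1, 2), F(1, 4), F(1, 4)]   # ancestral frequencies q_i
w = [F(1), F(2), F(3)]            # fitnesses w_i
z = [F(0), F(1), F(2)]            # measurements z_i

w_bar = sum(qi * wi for qi, wi in zip(q, w))
z_bar = sum(qi * zi for qi, zi in zip(q, z))

# Average excess in fitness for each type.
a = [wi / w_bar - 1 for wi in w]

# The frequency changes q_i a_i sum to zero over the population.
mean_excess = sum(qi * ai for qi, ai in zip(q, a))

# Selection expressed through the average excess.
selection = sum(qi * ai * zi for qi, ai, zi in zip(q, a, z))

# The same quantity from the population covariance.
cov_wz = sum(qi * wi * zi for qi, wi, zi in zip(q, w, z)) - w_bar * z_bar

print(a, mean_excess, selection, cov_wz / w_bar)
```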

We can think of the state of the population as the 
listing of character states, $z_i$. Thus we write the population
state as $\mathbf{z} = (z_1, z_2, \ldots)$. The subscripts run
over every different entity in the population, so the vector
$\mathbf{z}$ is a complete description of the entire population.
Similarly, for the frequency fluctuations, $\Delta q_i = q_i a_i$,
we can write the listing of all fluctuations as a vector,
$\Delta\mathbf{q} = (\Delta q_1, \Delta q_2, \ldots)$.

It is often convenient to use the dot product notation 

$$
\Delta\mathbf{q} \cdot \mathbf{z} = \sum (\Delta q_i) z_i = \mathrm{Cov}(w, z)/\bar{w}
$$

in which the dot specifies the sum obtained by multiply- 
ing each pair of items from two vectors. Before turning 
to some geometric examples in the following section, we 
need a definition for the length of a vector. Traditionally, 
one uses the definition 

$$\|\mathbf{z}\| = \sqrt{\sum_i z_i^2}$$

in which the length is the square root of the sum of 
squares, which is the standard measure of length in Eu- 
clidean geometry. 
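The notation above can be checked numerically. The following minimal sketch assumes a hypothetical population of three types; the frequencies, fitnesses and character values are illustrative numbers, not taken from any data set. It computes the average excess and the frequency changes of Eq. (3), then compares the dot product with the covariance form.

```python
# Hypothetical three-type population; every number here is an illustrative assumption.
q = [0.2, 0.5, 0.3]   # ancestral frequencies q_i, summing to one
w = [0.8, 1.0, 1.5]   # fitnesses w_i
z = [1.0, 2.0, 4.0]   # character values z_i


def dot(u, v):
    """Dot product: sum of the products of paired items from two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(u):
    """Euclidean length: square root of the sum of squares."""
    return dot(u, u) ** 0.5


w_bar = dot(q, w)   # average fitness
z_bar = dot(q, z)   # average character value

# Average excess a_i = w_i / w_bar - 1 and frequency change dq_i = q_i * a_i (Eq. 3).
a = [wi / w_bar - 1.0 for wi in w]
dq = [qi * ai for qi, ai in zip(q, a)]

# Left side of the identity: the dot product of dq and z.
selection_dot = dot(dq, z)

# Right side: Cov(w, z) / w_bar, with the covariance weighted by q.
cov_wz = sum(qi * wi * zi for qi, wi, zi in zip(q, w, z)) - w_bar * z_bar
selection_cov = cov_wz / w_bar

print(selection_dot, selection_cov)   # the two values agree up to rounding
print(norm(z))                        # length of the character vector
```

The frequency changes sum to zero, as they must, because the descendant frequencies also sum to one.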

Covariance as a measure of distance: examples 

A simple identity relates a dot product to a measure
of distance and to covariance selection

$$\Delta\mathbf{q}\cdot\mathbf{z} = \|\Delta\mathbf{q}\|\,\|\mathbf{z}\|\cos\phi = \mathrm{Cov}(w, z)/\bar{w}, \tag{5}$$

where $\phi$ is the angle between the vectors $\Delta\mathbf{q}$ and $\mathbf{z}$
(Fig. 1). If we standardize the character vector as
$\hat{\mathbf{z}} = \mathbf{z}/\|\mathbf{z}\|$, then the standardized vector has a length of one,
$\|\hat{\mathbf{z}}\| = 1$, which simplifies the dot product expression of
selection to

$$\Delta\mathbf{q}\cdot\hat{\mathbf{z}} = \|\Delta\mathbf{q}\|\cos\phi,$$

providing the geometric representation illustrated in
Fig. 1.

The covariance can be expressed as the product of a 
regression coefficient and a variance term 

$$\mathrm{Cov}(w, z)/\bar{w} = \beta_{zw}\mathrm{Var}(w)/\bar{w} = \beta_{wz}\mathrm{Var}(z)/\bar{w}, \tag{6}$$

where the notation $\beta_{xy}$ describes the regression coefficient
of $x$ on $y$ [3j. This identity shows that the expression
of selection in terms of a regression coefficient and
a variance term is equivalent to the geometric expression 
of selection in terms of distance. 
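The equivalence of the geometric and regression forms can be illustrated with the character vector of Fig. 1, z = (1, 4). The sketch below is only an illustration: the frequencies q = (0.5, 0.5) and fitnesses w = (0.2, 1.8) are assumptions chosen so that the frequency changes equal (-0.4, 0.4), as in panel (a) of the figure.

```python
import math

# Assumed frequencies and fitnesses, chosen so that dq matches Fig. 1a.
q = [0.5, 0.5]
w = [0.2, 1.8]
z = [1.0, 4.0]   # character vector used in Fig. 1


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


w_bar = dot(q, w)
z_bar = dot(q, z)
dq = [qi * (wi / w_bar - 1.0) for qi, wi in zip(q, w)]   # approximately (-0.4, 0.4)

norm_dq = math.sqrt(dot(dq, dq))
norm_z = math.sqrt(dot(z, z))

# Geometric form (Eq. 5): angle between dq and z, and projection of dq on z / ||z||.
cos_phi = dot(dq, z) / (norm_dq * norm_z)
phi_degrees = math.degrees(math.acos(cos_phi))
projection = norm_dq * cos_phi

# Regression forms (Eq. 6), using q-weighted moments.
cov_wz = sum(qi * wi * zi for qi, wi, zi in zip(q, w, z)) - w_bar * z_bar
var_w = sum(qi * wi * wi for qi, wi in zip(q, w)) - w_bar ** 2
var_z = sum(qi * zi * zi for qi, zi in zip(q, z)) - z_bar ** 2
beta_zw = cov_wz / var_w   # regression of z on w
beta_wz = cov_wz / var_z   # regression of w on z

# Five expressions for the same quantity.
forms = [
    dot(dq, z),
    norm_dq * norm_z * cos_phi,
    cov_wz / w_bar,
    beta_zw * var_w / w_bar,
    beta_wz * var_z / w_bar,
]
print(forms)
print(round(phi_degrees), round(projection, 2))
```

With these assumed inputs, the angle rounds to 59 degrees and the projection to 0.29, matching the values reported for panel (a).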

I emphasize these identities for two reasons. First, as 
Mazur stated in the epigraph: "The heart and soul of 
much mathematics consists of the fact that the 'same' 
object can be presented to us in different ways." If an 
object is important, such as natural selection surely is, 
then it pays to study that object from different perspec- 
tives to gain deeper insight. 

Second, the appearance of statistical functions, such 
as the covariance and variance, in selection equations 
sometimes leads to mistaken conclusions. In the selec- 
tion equations, it is better to think of the covariance and 
variance terms arising because they are identities with ge- 
ometric or other interpretations of selection, rather than 
thinking of those terms as summary statistics of proba- 
bility distributions. The problem with thinking of those 
terms as statistics of probability distributions is that the 
variance and covariance are not in general sufficient de- 
scriptions for probability distributions. That lack of suffi- 
ciency for probability may lead one to conclude that those 
terms are not sufficient for a general expression of selec- 
tion. However, those covariance and variance terms are 
sufficient. That sufficiency can be understood by think- 
ing of those terms as identities for distance or measures 
of information [36].

It is true that in certain particular applications of 
quantitative genetics or stochastic sampling processes, 
one does interpret the variances and covariances as sum- 
mary statistics of probability distributions, usually the 
normal or Gaussian distribution. However, it is impor- 
tant to distinguish those special applications from the 
general selection equations. 

Invariance and information 

For the general selection expression in Eq. (5), any
transformations that do not affect the net values are invariant
with respect to selection. For example, transformations
of the fitnesses and associated frequency
changes, $\Delta\mathbf{q}$, are invariant if they leave unchanged the
distance expressed by $\Delta\mathbf{q}\cdot\mathbf{z} = \mathrm{Cov}(w, z)/\bar{w}$. Similarly,
changes in the pattern of phenotypes are invariant to the
extent that they leave $\Delta\mathbf{q}\cdot\mathbf{z}$ unchanged. These invariance
properties of selection, measured as distance, may not
appear very interesting at first glance. They seem to be
saying that the outcome is the outcome. However, the
history of science suggests that studying the invariant
properties of key expressions can lead to insight.

FIG. 1. Geometric expression of selection. The plots show the
equivalence of the dot product, the geometric expression and
the covariance, as given in Eq. (5). For both plots, $\mathbf{z} = (1, 4)$
and $\hat{\mathbf{z}} = \mathbf{z}/\|\mathbf{z}\| = (0.24, 0.97)$. The dashed line shows the
perpendicular between the pattern of frequency changes derived
from fitnesses, $\Delta\mathbf{q}$, and the phenotypic pattern, $\hat{\mathbf{z}}$. The
vertex of the two vectors is at the origin (0, 0). The distance from
the origin to the intersection of the perpendicular along $\hat{\mathbf{z}}$ is
the total amount of selection, $\|\Delta\mathbf{q}\|\cos\phi$. (a) The vector of
frequency changes that summarize fitness is $\Delta\mathbf{q} = (-0.4, 0.4)$.
The angle between the vector of frequency changes and the
phenotypes is $\phi = \arccos[(\Delta\mathbf{q}\cdot\hat{\mathbf{z}})/\|\Delta\mathbf{q}\|]$ which, in this
example, is 1.03 radians or 59°. In this case, the total selection
is $\|\Delta\mathbf{q}\|\cos\phi = 0.29$. (b) In this plot, $\Delta\mathbf{q} = (0.4, -0.4)$,
yielding an angle $\phi$ of 121°. The perpendicular intersects
the negative projection of the phenotype vector, shown as a
dashed line, associated with the negative change by selection
of $\|\Delta\mathbf{q}\|\cos\phi = -0.29$.

Few authors have developed an interest in the invariant 
qualities of selection. Fisher [39] initiated discussion with
his fundamental theorem of natural selection, a special 
case of Eq. ^ [TO]. Although many authors commented 
on the fundamental theorem, most articles did not ana- 
lyze the theorem with respect to its essential mathemat- 
ical insights about selection. Ewens [12] reviewed the 
few attempts to understand the mathematical basis of 
the theorem and its invariant quantities. Frank [36] tied 
the theorem to Fisher information [331 SI]> hinting at 
an information theory interpretation that arises from the 
fundamental selection equation of Eq. ^ . 

In spite of the importance of selection in many fields 
of science, the potential interpretation of Eq. (5) with
respect to invariants of information theory has hardly 
been developed. I briefly outline the potential connec- 
tions here [36] . I develop this information perspective of 
selection in a later article, along with Fisher's fundamen- 
tal theorem. 

To start, define the partial change in phenotype caused
by natural selection as

$$\Delta_s\bar{z} = \Delta\mathbf{q}\cdot\mathbf{z} = \mathrm{Cov}(w, z)/\bar{w}. \tag{7}$$

The concept of a partial change caused by natural selection
arises from Fisher's fundamental theorem [39, 45-47].
With this definition, we can use eqns 5 and 6 to
write

$$\Delta_s\bar{z} = \beta_{zw}\mathrm{Var}(w)/\bar{w} = \bar{w}\beta_{zw}\mathrm{Var}(w/\bar{w}). \tag{8}$$

From Eq. (3), we have the definition for the average excess
in fitness $a_i = w_i/\bar{w} - 1$. Thus, we can expand the
expression for the variance in fitness as

$$\mathrm{Var}(w/\bar{w}) = \sum_i q_i\left(\frac{w_i}{\bar{w}} - 1\right)^2 = \sum_i q_i a_i^2.$$

From Eq. (3), we also have the change in frequency in
terms of the average excess, $\Delta q_i = q_i a_i$, and equivalently,
$a_i = \Delta q_i/q_i$, thus

$$\mathrm{Var}(w/\bar{w}) = \sum_i q_i\left(\frac{\Delta q_i}{q_i}\right)^2 = \sum_i\left(\frac{\Delta q_i}{\sqrt{q_i}}\right)^2 = \Delta\hat{\mathbf{q}}\cdot\Delta\hat{\mathbf{q}},$$

where $\Delta\hat{q}_i = \Delta q_i/\sqrt{q_i}$ is a standardized fluctuation in
frequency, and $\Delta\hat{\mathbf{q}}$ is the vector of standardized fluctuations.
These alternative forms simply express the variance
in fitness in different ways. The interesting result
follows from the fact that

$$\mathrm{Var}(w/\bar{w}) = \Delta\hat{\mathbf{q}}\cdot\Delta\hat{\mathbf{q}} = F(\Delta\hat{\mathbf{q}})$$

is the Fisher information, $F$, in the frequency fluctuations,
$\Delta\hat{\mathbf{q}}$. Fisher information is a fundamental quantity
in information theory, Bayesian analysis, likelihood the- 
ory and the informational foundations of statistical in- 
ference. Fisher information is a variant form of the more 
familiar Shannon and Kullback-Leibler information mea- 
sures, in which the Fisherian form expresses changes in 
information. 

Once again, we have a simple identity. Although it 
is true that Fisher information is just an algebraic rear- 
rangement of the variance in fitness, some insight may be 
gained by relating selection to information. The variance 
form calls to mind a statistical description of selection or 
a partial description of a probability distribution. The 
Fisher information form suggests a relation between natural
selection and the way in which populations accumulate
information [36].



We may now write our fundamental expression for selection
as

$$\Delta_s\bar{z} = \bar{w}\beta_{zw}F(\Delta\hat{\mathbf{q}}).$$

We may read this expression for selection as: the change
in mean character value caused by natural selection, $\Delta_s\bar{z}$,
is equal to the total Fisher information in the frequency
fluctuations, $F$, multiplied by the scaling $\beta$ that describes
the amount of the potential information that the population
captures when expressed in units of phenotypic
change. In other words, the distance $\Delta_s\bar{z}$ measures the
informational gain by the population caused by natural
selection.
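The informational form can be checked against the covariance form in the same way as the earlier identities. The sketch below reuses a hypothetical three-type population (all numbers are illustrative assumptions). It standardizes each frequency fluctuation by the square root of its frequency, sums the squares to obtain F, and then rescales by the average fitness and the regression of character on fitness.

```python
import math

# Hypothetical three-type population; the numbers are illustrative assumptions.
q = [0.2, 0.5, 0.3]   # frequencies
w = [0.8, 1.0, 1.5]   # fitnesses
z = [1.0, 2.0, 4.0]   # character values

w_bar = sum(qi * wi for qi, wi in zip(q, w))
z_bar = sum(qi * zi for qi, zi in zip(q, z))

# Frequency fluctuations and their standardized form, dq_i / sqrt(q_i).
dq = [qi * (wi / w_bar - 1.0) for qi, wi in zip(q, w)]
dq_std = [dqi / math.sqrt(qi) for dqi, qi in zip(dq, q)]

# Fisher information in the fluctuations: the squared length of dq_std.
fisher = sum(x * x for x in dq_std)

# Variance of relative fitness, computed directly from its definition.
var_rel_w = sum(qi * (wi / w_bar - 1.0) ** 2 for qi, wi in zip(q, w))

# Regression of character on fitness, from q-weighted moments.
cov_wz = sum(qi * wi * zi for qi, wi, zi in zip(q, w, z)) - w_bar * z_bar
var_w = sum(qi * wi * wi for qi, wi in zip(q, w)) - w_bar ** 2
beta_zw = cov_wz / var_w

ds_z_cov = cov_wz / w_bar               # covariance form (Eq. 7)
ds_z_info = w_bar * beta_zw * fisher    # informational form

print(fisher, var_rel_w)
print(ds_z_cov, ds_z_info)
```

Both pairs of printed values agree up to rounding, which reflects the point made in the text: the Fisher information form is an algebraic rearrangement of the variance in fitness.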

The invariances set by this expression may be viewed
in different ways. For example, the distance of evolutionary
change by selection, $\Delta_s\bar{z}$, is invariant with respect to
many different combinations of frequency fluctuations,
$\Delta\hat{\mathbf{q}}$, and scalings between phenotype and fitness. Similarly,
any transformations of frequency fluctuations that
leave the measure of information, $F(\Delta\hat{\mathbf{q}})$, invariant do
not alter the scaled change in phenotype caused by natural
selection. The full implications remain to be explored.

NOTE ADDED AFTER PUBLICATION OF
THE JOURNAL ARTICLE. I have oversimplified a
bit in this section, with the aim to keep the presentation
brief. The proper expression of Fisher information is
$F(\Delta\hat{\mathbf{q}})(\Delta\theta)^2$, where $\Delta\theta$ is the scale of change over which
population differences are measured, typically taken as

Afl^oGH- 
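
Summary of selection identities

The various identities for the part of total evolutionary
change caused by selection include

$$\begin{aligned}
\Delta_s\bar{z} &= \mathrm{Cov}(w, z)/\bar{w} \\
&= \bar{w}\beta_{zw}\mathrm{Var}(w/\bar{w}) \\
&= \Delta\mathbf{q}\cdot\mathbf{z} \\
&= \|\Delta\mathbf{q}\|\,\|\mathbf{z}\|\cos\phi \\
&= \bar{w}\beta_{zw}(\Delta\hat{\mathbf{q}}\cdot\Delta\hat{\mathbf{q}}) \\
&= \bar{w}\beta_{zw}F(\Delta\hat{\mathbf{q}}).
\end{aligned} \tag{9}$$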

These forms show the equivalence of the statistical, ge- 
ometrical and informational expressions for natural se- 
lection. These general abstract forms make no assump- 
tions about the nature of phenotypes and the patterns 
of frequency fluctuations caused by differential fitness. 
The phenotypes may be squared deviations so that the 
average is actually a variance, or the product of mea- 
surements on different characters leading to measures of 
association, or any other nonlinear combination of mea- 
surements. Thus, there is nothing inherently linear or 
restrictive about these expressions. 



Selection versus evolution 

The previous sections discussed the part of evolution- 
ary change caused by selection. The full Price equation 
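(Eq. (2)) gives a complete and exact expression of total
change, repeated here as

$$\Delta\bar{z} = \mathrm{Cov}(w, z)/\bar{w} + \mathrm{E}(w\Delta z)/\bar{w} \tag{10}$$

or in terms of the dot product notation as

$$\Delta\bar{z} = \Delta\mathbf{q}\cdot\mathbf{z} + \mathbf{q}'\cdot\Delta\mathbf{z}. \tag{11}$$

The full change in the phenotype is the sum of the two
terms, which we may express in symbols as

$$\Delta\bar{z} = \Delta_s\bar{z} + \Delta_E\bar{z}.$$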
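The full partition can be verified by computing the total change directly and comparing it with the sum of the two terms. In the sketch below, `z_desc` is a hypothetical list holding the average character value among the descendants of each ancestral type; it and the other inputs are illustrative assumptions.

```python
# Hypothetical three-type population; all values are illustrative assumptions.
q = [0.2, 0.5, 0.3]        # ancestral frequencies
w = [0.8, 1.0, 1.5]        # fitnesses
z = [1.0, 2.0, 4.0]        # ancestral character values
z_desc = [1.1, 1.9, 3.7]   # average character among the descendants of each type

w_bar = sum(qi * wi for qi, wi in zip(q, w))
z_bar = sum(qi * zi for qi, zi in zip(q, z))

# Descendant frequencies q'_i = q_i * w_i / w_bar.
q_desc = [qi * wi / w_bar for qi, wi in zip(q, w)]

# Total change computed directly: descendant mean minus ancestral mean.
total_change = sum(qd * zd for qd, zd in zip(q_desc, z_desc)) - z_bar

# Selection term, Cov(w, z) / w_bar.
cov_wz = sum(qi * wi * zi for qi, wi, zi in zip(q, w, z)) - w_bar * z_bar
selection = cov_wz / w_bar

# Second term, E(w * dz) / w_bar, which collects everything other than selection.
dz = [zd - zi for zd, zi in zip(z_desc, z)]
transmission = sum(qi * wi * dzi for qi, wi, dzi in zip(q, w, dz)) / w_bar

print(total_change, selection + transmission)
```

Changing `z_desc` arbitrarily alters the second term but never breaks the equality, which illustrates why the full equation is an identity rather than a model.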

Fisher [35J called the term $\Delta_E\bar{z}$ the change caused by the
environment [47]. However, the word environment often
leads to confusion. The proper interpretation is that $\Delta_E\bar{z}$
encompasses everything not included in the expression for
selection. The term is environmental only in the sense
that it includes all those forces external to the particular
definition of the selective forces for a particular problem.

The $\Delta_E$ term is sometimes associated with changes in
transmission [7J [TUJ H3 03]. This interpretation arises
because $\mathrm{E}(w\Delta z)$ is the fitness-weighted change in character
value between ancestor and descendant. One may
think of changes in character values as changes during
transmission.

It is important to realize that everything truly means 
every possible force that might arise and that is not ac- 
counted for by the particular expression for selection. 



Lightning may strike. New food sources may appear. 
The Price equation in its general and abstract form is a 
mathematical identity — what I previously called a math- 
ematical tautology [7]. 

In applications, one considers how to express $\Delta_E\bar{z}$,
or one searches for ways to formulate the problem so
that $\Delta_E\bar{z}$ is zero or approximately zero. This article is
not about particular applications. Here, I simply note
that when one works with Fisher's breeding value as $z$,
then near equilibria (fixed points), one typically obtains
$\Delta z \to 0$ and thus $\mathrm{E}(w\Delta z) \to 0$. In other cases, the
search for a good way to express a problem means finding
a form of character measurement that defines $z$ such
that characters tend to remain stable over time, so that
$\Delta z \to 0$ and thus $\mathrm{E}(w\Delta z) \to 0$. For applications that
emphasize calculation of complex dynamics rather than a 
more abstract conceptual analysis of a problem, methods 
other than the Price equation often work better. 



ABSTRACT PROPERTIES: RECURSION AND 
GROUP SELECTION 

To iterate is human, to recurse, divine [49].

Essentially all modern discussions of multilevel selection 
and group selection derive from Price [J, as developed 
by Hamilton [5J. Price and Hamilton noted that the 
Price equation can be expanded recursively to represent 
nested levels of analysis, for example, individuals living 
in groups. 

Start with the basic Price equation as given in Eq. (10).
The left side is the total change in average phenotype, $\bar{z}$.
The second term on the right side includes the terms $\Delta z_i$
in $\mathrm{E}(w\Delta z) = \sum_i q_i w_i \Delta z_i$.

Recall that in defining $z_i$, we specified the meaning of
the index i to be any sort of labeling of set members, subject
to minimal consistency requirements. We may, for
example, label all members of a group by i, and measure
$z_i$ as some property of the group. If the index i itself represents
a set, then we may consider the members of that
set. For example, $z_{ij}$ may be the jth member of the ith
set, or we may say, the ith group. In the abstract mathematical
expression, there is no need to think of the ith
group as having any spatial or biological meaning. How- 
ever, we may consider i as a label for spatially defined 
groups if we wish to do so. 

With i defining a group, we may analyze the selection 
and evolution of that ith group. The term $\Delta z_i$ becomes
the average change in the z measure for the ith group,
composed of members with values $z_{ij}$. The terms $z'_{ij}$ are
the average property values of the descendants of the jth
entity in the ith group. The descendant entities that de- 
rive from the ith group do not have to form any sort of 
group or other meaningful structuring, just as the origi- 
nal i labeling does not have to refer to group structuring 
in the ancestors. However, we may if we wish consider 
descendants of i as retaining some sense of the ancestral 
grouping. 






Because $z_i$ represents an averaging over the entities j in
the ith group, we are assuming the notational equivalence
$\Delta z_i = \Delta\bar{z}_i$. From that point of view, for each group i we
may from Eq. (10) express the change in the group mean
by thinking of each group as a separate set or population,
yielding for each i the expression

$$\Delta z_i = \Delta\bar{z}_i = \mathrm{Cov}(w_i, z_i)/\bar{w}_i + \mathrm{E}(w_i\Delta z_i)/\bar{w}_i.$$

We may substitute this expression for each i into the
$\mathrm{E}(w\Delta z) = \sum_i q_i w_i\Delta z_i$ term on the right side of Eq. (10).
That substitution recursively expands each change in
property value, $\Delta z_i$, to itself be composed of a selection
term and property value change term. For each group, i,
we now have expressions for selection within the group,
$\mathrm{Cov}(w_i, z_i)/\bar{w}_i$, and average property value change within
the group, $\mathrm{E}(w_i\Delta z_i)/\bar{w}_i$. If we write out the full expression
for this last term, we obtain

$$\mathrm{E}(w_i\Delta z_i)/\bar{w}_i = \sum_j q_{j|i}\,w_{ij}\Delta z_{ij}/\bar{w}_i,$$

where $q_{j|i}$ denotes the frequency of the jth entity within the ith group.



In the term $\Delta z_{ij}$, each labeling, j, may itself be a subgroup
within the larger grouping represented by i. The
recursive nature of the Price equation allows another expansion
to the characters $z_{ijk}$ for the kth entity in the
jth grouping that is nested in the ith group, and so on.
Once again, the indexing for levels i, j, and k does not have
to correspond to any particular structuring, but we may 
choose to use a structuring if we wish. 
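The recursive expansion can be sketched in a few lines. The example below assumes a hypothetical population of three groups with two members each; the group frequencies `q_group`, the within-group frequencies `q_within`, and all fitness and character values are illustrative. The helper `price_terms` is a name introduced here for one level of the Price equation. It is applied first within each group and then among groups, and the result is compared with the change computed directly over all members.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def price_terms(q, w, z, z_desc):
    """One level of the Price equation: returns (selection, transmission, w_bar)."""
    w_bar = dot(q, w)
    z_bar = dot(q, z)
    cov_wz = sum(qi * wi * zi for qi, wi, zi in zip(q, w, z)) - w_bar * z_bar
    e_w_dz = sum(qi * wi * (zd - zi) for qi, wi, zi, zd in zip(q, w, z, z_desc))
    return cov_wz / w_bar, e_w_dz / w_bar, w_bar


# Hypothetical two-level population: three groups (index i), two members each (index j).
q_group = [0.5, 0.3, 0.2]                            # group frequencies
q_within = [[0.5, 0.5], [0.25, 0.75], [0.6, 0.4]]    # member frequencies within groups
w = [[1.0, 1.4], [0.8, 1.2], [1.1, 0.9]]             # member fitnesses w_ij
z = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]             # member characters z_ij
z_desc = [[0.1, 0.9], [0.0, 1.0], [0.05, 1.0]]       # descendant averages z'_ij

# Inner level: treat each group as its own population.
group_w, group_z, group_dz = [], [], []
for qi, wi, zi, zdi in zip(q_within, w, z, z_desc):
    sel_i, trans_i, w_bar_i = price_terms(qi, wi, zi, zdi)
    group_w.append(w_bar_i)            # group fitness is the within-group mean
    group_z.append(dot(qi, zi))        # group character is the within-group mean
    group_dz.append(sel_i + trans_i)   # change in the group mean

# Outer level: the change in each group mean enters the second term.
group_z_desc = [zi + dzi for zi, dzi in zip(group_z, group_dz)]
between, within, w_bar = price_terms(q_group, group_w, group_z, group_z_desc)
dz_recursive = between + within

# Direct calculation over all members, ignoring the group structure.
dz_direct = 0.0
for qg, qi, wi, zi, zdi in zip(q_group, q_within, w, z, z_desc):
    for qj, wij, zij, zdij in zip(qi, wi, zi, zdi):
        dz_direct += qg * qj * (wij / w_bar * zdij - zij)

print(dz_recursive, dz_direct)
```

Here `between` is selection among groups, and `within` averages the within-group selection and transmission terms over groups. A third level would repeat the same call one step further down.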

One could analyze biological problems of group selec- 
tion without using the Price equation. Because the Price 
equation is a mathematical identity, there are always 
other ways of expressing the same thing. However, in 
the 1970s, when group selection was a very confused sub- 
ject, the Price equation's recursive nature and Hamilton's 
development provided the foundation for subsequent un- 
derstanding of the topic. All modern conceptual insights 
about group selection derive from Price's recursive ex- 
pansion of his abstract expression of selection. 



HISTORY AND ALTERNATIVE EXPRESSIONS 
OF SELECTION 

I have emphasized the general and abstract form of 
the Price equation. That abstract form was first pre- 
sented rather cryptically by Price [3]. In that article, 
Price described the recursive expansion to analyze group 
selection. Apart from the recursive aspect, the more gen- 
eral abstract properties were hardly mentioned in Price 
[3] and not developed by others until 1995. 

While I was writing my history of Price's contributions 
to evolutionary genetics [7], I found Price's unpublished
manuscript The nature of selection among W. D. Hamil- 
ton's papers. Price's unpublished manuscript gave a very 
general and abstract scheme for analyzing selection in 
terms of set relations. However, Price did not explic- 
itly connect the abstract set relation scheme to the Price 
equation or to his earlier publications [3J 13] • 



I had The nature of selection published posthumously 
as Price [35]. In my own article, I explicitly developed
the general interpretation of the Price equation as the
formal abstract expression of the relation between two
sets [7].

Price [3j wrote an earlier article in which he presented 
a covariance selection equation that emphasized the con- 
nection to classical models of population genetics and 
gene frequency change. That earlier covariance form 
lacks the abstract set interpretation and generally has 
narrower scope. Preceding Price, Robertson and Li 
[5U] also presented selection equations that are similar 
to Price's [3J covariance expression. Robertson's covari- 
ance form itself arises from classical quantitative genetics 
and the breeder's equation, ultimately deriving from the 
foundations of quantitative genetics established by Fisher 
[51] . Li's form presents a covariance type of expression 
for classical population genetic models of gene frequency 
change. 

One cannot understand the current literature without 
a clear sense of this history. Almost all applications of 
the Price equation to kin and group selection, and to 
other problems of evolutionary analysis, derive from ei- 
ther the classical expressions of quantitative genetics |H] 
or classical expressions of population genetics [50] . 

In light of this history, criticisms can be confusing with 
regard to the ways in which the Price equation is com- 
monly used. For example, in applications to kin or group 
selection, the Price equation mainly serves to package the 
notation for the Robertson form of quantitative genetic 
analysis or the Li form of population genetic analysis. 
The Price equation packaging brings no extra assump- 
tions. In some applications, critics may believe that the 
particular analysis lacks enough assumptions to attain a 
desired level of specificity. One can, of course, easily add 
more assumptions, at the expense of reduced generality. 

The following sections briefly describe some alternative 
forms of the Price equation and the associated history. 
That history helps to place criticisms of the Price equa- 
tion and its applications into clearer light. 



Quantitative genetics and the breeder's equation 

Fisher [51] established the modern theory of quantitative
genetics, following the early work of Galton, Pearson,
Weldon, Yule and others. The equations of selection
in quantitative genetics and animal breeding arose from 
that foundation. Many modern applications of the Price 
equation to particular problems follow this tradition of 
quantitative genetics. A criticism of these Price equation 
applications is a criticism of the central approach of evo- 
lutionary quantitative genetics. Such criticisms may be 
valid for certain applications, but they must be evaluated 
in the broader context of quantitative genetics theory. 
This section shows the relation between quantitative genetics
and a commonly applied form of the Price equation.

Evolutionary aspects of quantitative genetics developed
from the breeder's equation

$$R = Sh^2,$$

in which the response to selection, $R$, equals the selection
differential, $S$, multiplied by the heritability, $h^2$. The
separation of selection and transmission is the key to the 
breeder's equation and to quantitative genetics theory. 

The covariance term of the Price equation is equivalent 
to the selection differential, S, when one interprets the 
meaning of fitness and descendants in a particular way. 
Suppose that we label each potential parent in the ances- 
tral population of size N with the index, i. The initial 
weighting of each parent in the ancestral population is 
$q_i = 1/N$. Assign to each potential parent a weighting
with respect to breeding contribution, $q'_i = q_i w_i$, with fitnesses
standardized so that $\bar{w} = 1$ and the $w_i$ are relative
fitnesses.

With this setup, ancestors are the initial population 
of potential parents, each weighted equally, and descendants
are the same population of parents, weighted by
their breeding contribution. The character value for
each individual remains unchanged between the ancestor
and descendant labelings. These assumptions lead
to $\Delta\bar{z}^* = \mathrm{Cov}(w, z)$, the change in the average character
value between the breeding population and the initial
population. That difference is defined as S, the selection 
differential. 

To analyze the fraction of the selection differential 
transmitted to offspring, classical quantitative genetics 
follows Fisher [51] to separate the character value as
$z = g + e$, with a transmissible genetic component, $g$, and
a component that is not transmitted, which we may call
the environmental or unexplained component, $e$. Following
standard regression theory for this sort of expression,
$\bar{e} = 0$.

For a parent with z = g + e, the average character 
value contribution ascribed to the parent among its de- 
scendants is z' = g, following the idea that g represents 
the component of the parental character that is trans- 
mitted to offspring. If we assume that the only fluctu- 
ations of average character value in offspring are caused 
by the transmissible component that comes from parents, 
then the genetic component measured by g is sufficient 
to explain expected offspring character values. Thus, 
$\Delta z = z' - z = -e$, and $\mathrm{E}(w\Delta z) = -\mathrm{Cov}(w, e)$.

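Substituting into the full Price equation from Eq. (2)
and assuming $\bar{w} = 1$ so that all fitnesses are normalized,

$$\begin{aligned}
\Delta\bar{z} &= \mathrm{Cov}(w, z) + \mathrm{E}(w\Delta z) \\
&= \mathrm{Cov}(w, g) + \mathrm{Cov}(w, e) - \mathrm{Cov}(w, e) \\
&= \mathrm{Cov}(w, g).
\end{aligned} \tag{12}$$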

The expression $\Delta\bar{z} = \mathrm{Cov}(w, g)$ was first emphasized by
Robertson [55], and is sometimes called Robertson's secondary
theorem of natural selection. Robertson's expression
summarizes the foundational principles of quantitative
genetics, as conceived by Fisher [51] and developed
over the past century J3UJ G>U IS3] • 



It is commonly noted that Robertson's theorem is re- 
lated to the classic breeder's equation. In particular, 

$$R = \Delta\bar{z} = \mathrm{Cov}(w, g) = \mathrm{Cov}(w, z)h^2 = Sh^2,$$

where $R$ is the response to selection, $S = \mathrm{Cov}(w, z)$ is the
selection differential, and $h^2 = \mathrm{Var}(g)/\mathrm{Var}(z)$ is a form of
heritability, a measure of the transmissible genetic com- 
ponent. Additional details and assumptions can be found 
in several articles and texts [TUJ O [SI] . 
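A small worked example may help to fix the steps from Eq. (12) to the breeder's equation. The sketch below assumes four equally weighted parents, a genetic component g and a residual e constructed to be uncorrelated with mean zero, and fitness that depends linearly on z. The linear fitness function is an additional assumption of this example, introduced so that Cov(w, g) = Cov(w, z)h² holds exactly.

```python
# Four hypothetical parents, each with weight 1/N; all values are illustrative.
g = [0.0, 0.0, 1.0, 1.0]      # transmissible genetic component
e = [-0.5, 0.5, -0.5, 0.5]    # non-transmitted component, mean zero, uncorrelated with g
z = [gi + ei for gi, ei in zip(g, e)]


def mean(x):
    return sum(x) / len(x)


def cov(x, y):
    """Population covariance with equal weights."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / len(x)


# Assumed fitness: linear in z and normalized so that mean fitness is one.
w = [1.0 + 0.4 * (zi - mean(z)) for zi in z]

S = cov(w, z)                 # selection differential
h2 = cov(g, g) / cov(z, z)    # heritability, Var(g) / Var(z)
R_breeder = S * h2            # breeder's equation
R_robertson = cov(w, g)       # Robertson's covariance form

# Full Price equation with z' = g, so that dz = -e for every parent.
dz = [-ei for ei in e]
e_w_dz = mean([wi * dzi for wi, dzi in zip(w, dz)])
R_price = cov(w, z) + e_w_dz

print(S, h2, R_breeder, R_robertson, R_price)
```

With these inputs the selection differential is 0.2, the heritability is 0.5, and all three expressions for the response give 0.1.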

Population genetics and the covariance expression 

Price [3] expressed his original formulation in terms of 
gene frequency change and classical population genetics, 
rather than the abstract set relations that I have empha- 
sized. At that time, it seems likely that Price already 
had the broader, more abstract theory in hand, and was 
presenting the population genetics form because of its 
potential applications. The article begins 

This is a preliminary communication describ- 
ing applications to genetical selection of a new 
mathematical treatment of selection in gen- 
eral. 

Gene frequency change is the basic event 
in biological evolution. The following 
equation. . .which gives frequency change un- 
der selection from one generation to the next 
for a single gene or for any linear function of 
any number of genes at any number of loci, 
holds for any sort of dominance or epistasis, 
for sexual or asexual reproduction, for ran- 
dom or nonrandom mating, for diploid, hap- 
loid or polyploid species, and even for imagi- 
nary species with more than two sexes. . . 

Using my notation, Price writes the basic covariance form 

$$\Delta P = \mathrm{Cov}(w, p)/\bar{w} = \beta_{wp}\mathrm{Var}(p)/\bar{w}. \tag{13}$$

In a simple application, p could be interpreted as gene 
frequency at a single diploid locus with two alleles. Then 
$P = \bar{p}$ is the gene frequency in the population, and $\beta_{wp}$ is
the regression of individual fitness on individual gene fre- 
quency, in which the individual gene frequency is either 
0, 1/2 or 1 for an individual with 0, 1 or 2 copies of the 
allele of interest. Li [51]] gave an identical gene frequency 
expression in his eqn 4. 
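The single-locus interpretation can be made concrete with a short calculation. The sketch below assumes three diploid genotypes in Hardy-Weinberg proportions for an allele frequency of 0.3, with a fitness advantage only for the homozygote carrying two copies of the allele. The genotype frequencies and fitnesses are illustrative assumptions, and transmission is taken to be unbiased.

```python
# Genotypes carry 0, 1 or 2 copies of the allele of interest.
p = [0.0, 0.5, 1.0]          # individual gene frequency for each genotype
q = [0.49, 0.42, 0.09]       # assumed genotype frequencies (allele frequency 0.3)
w = [1.0, 1.0, 1.5]          # assumed fitnesses: recessive advantage, a form of dominance

w_bar = sum(qi * wi for qi, wi in zip(q, w))
P = sum(qi * pi for qi, pi in zip(q, p))   # population gene frequency

# Gene frequency among the selected parents, computed directly.
P_after = sum(qi * wi * pi for qi, wi, pi in zip(q, w, p)) / w_bar
delta_P_direct = P_after - P

# Covariance and regression forms of Eq. (13).
cov_wp = sum(qi * wi * pi for qi, wi, pi in zip(q, w, p)) - w_bar * P
var_p = sum(qi * pi * pi for qi, pi in zip(q, p)) - P ** 2
beta_wp = cov_wp / var_p                   # regression of fitness on gene frequency

delta_P_cov = cov_wp / w_bar
delta_P_reg = beta_wp * var_p / w_bar

print(delta_P_direct, delta_P_cov, delta_P_reg)
```

The three values coincide even though fitness is not a linear function of the number of allele copies, in line with Price's remark that the expression holds for any sort of dominance.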

In more general applications, one can study a p-score 
that summarizes the number of copies of various alleles 
present in an individual, or in whatever entities are be- 
ing tracked. In classical population genetics, the p-score 
would be, in Price's words above, "any linear function of 
any number of genes at any number of loci." Here, lin- 
earity means that p is essentially a counting of presence 
versus absence of various things within the ith entity. 
Such counting does not preclude nonlinear interactions
between alleles or those things being counted with respect
to phenotype, which is why Price said that the
expression holds for any form of dominance or epistasis. 

Hamilton [17] used Price's gene frequency form in his
first clear derivations of the direct and the inclusive fit- 
ness models of kin selection theory. Most early appli- 
cations of the Price equation used this gene frequency 
interpretation. 



Price [3] emphasized that the value of Eq. (13) arises
from its benefits for qualitative reasoning rather than calculation.
The necessary assumptions can be seen from
the form given by Price, which is always exact, here written
in my notation

$$\Delta P = \mathrm{Cov}(w, p)/\bar{w} + \mathrm{E}(w\Delta p)/\bar{w},$$

where $\Delta p$ is interpreted as the change in state between
parental gene frequency for the ith entity and the average
gene frequency for the part of descendants derived from
the ith entity.

In practice, $\Delta p = 0$ usually means Mendelian segregation,
no biased mutation, and no sampling biases associated
with drift. Most population genetics theory of
traits such as social behavior typically makes those assumptions,
so that Eq. (13) is sufficient with respect to
analyzing change in gene frequency or in p-scores [55].
However, the direction of change in gene frequency or p- 
score is not sufficient to predict the direction of change 
in phenotype. To associate the direction of change in p- 
score to the direction of change in phenotype, one must 
make the assumption that phenotype changes monotoni- 
cally with p-score. Such monotonicity is a strong assump- 
tion, which is not always met. For that reason, p-score 
models sometimes buy simplicity at a rather high cost. 
In other applications, monotonicity is a reasonable as- 
sumption, and the p-score models provide a very simple 
and powerful approach to understanding the direction of 
evolutionary change. 

The costs and benefits of the p-score model are not par- 
ticular to the Price equation. Any analysis based on the 
same assumptions has the same limitations. The Price 
equation provides a concise and elegant way to explore 
the consequences when certain simplifying assumptions 
can reasonably be applied to a particular problem. 



ALTERNATIVE FORMS OR 
INTERPRETATIONS OF THE FULL EQUATION 

The full Price equation partitions total evolutionary 
change into components. Many alternative partitions ex- 
ist. A partition provides value if it improves conceptual 
clarity or eases calculation. 

Which partitions are better than others? Better is al- 
ways partly subjective. What may seem hard for me may 
appear easy to you. Nonetheless, it would be a mistake to 
suggest that all differences are purely subjective. Some 
forms are surely better than others for particular prob- 
lems, even if better remains hard to quantify. As Russell 



[5UI p. 14] said in another context, "All such conventions 
are equally legitimate, though not all are equally conve- 
nient." 

Many partitions of evolutionary change include some 
aspect of selection and some aspect of property or trans- 
mission change. Most of those variants arise by minor 
rearrangements or extensions of the basic Price expres- 
sion. A few examples follow. 



Contextual analysis 

Heisler and Damuth [57] introduced the phrase contextual
analysis to the evolutionary literature. Contextual
analysis is a form of path analysis, which partitions
causes by statistical regression models. Path analysis has 
been used throughout the history of genetics (SS] . It is a 
useful approach whenever one wishes to partition varia- 
tion with respect to candidate causes. The widely used 
method of Lande and Arnold [32] to analyze selection is 
a particular form of path analysis. 

Okasha [15] argued that contextual analysis is an alter- 
native to the Price equation. To develop a simple exam- 
ple, let us work with just the selection part of the Price 
equation 

$$\bar{w}\Delta\bar{z} = \mathrm{Cov}(w, z).$$

A path (contextual) analysis refines this expression by 
partitioning the causes of fitness with a regression equa- 
tion. Suppose we express fitness as depending on two 
predictors: the focal character that we are studying, z, 
and another character, y. Then we can write fitness as 

$$w = \beta_{wz}z + \beta_{wy}y + \epsilon$$

in which the $\beta$ terms are partial regressions of fitness
on each character, and $\epsilon$ is the unexplained residual of
fitness. Substituting into the Price equation, we get the
sort of expression made popular by Lande and Arnold
[32]

$$\bar{w}\Delta\bar{z} = \beta_{wz}\mathrm{Var}(z) + \beta_{wy}\mathrm{Cov}(y, z).$$

If the partitioning of fitness into causes is done in a 
useful way, this type of path analysis can provide signifi- 
cant insight. I based my own studies of natural selection 
and social evolution on this approach jTQl [21] ■ 
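The substitution step relies on the least-squares residual being uncorrelated with each predictor, so that the residual drops out of Cov(w, z). The sketch below illustrates this with five hypothetical individuals. The data are invented for illustration, and the partial regression coefficients are obtained from the 2 x 2 normal equations, with variables centered on their means (an assumption equivalent to including an intercept in the regression).

```python
# Five hypothetical individuals with equal weights; all values are illustrative.
z = [1.0, 2.0, 3.0, 4.0, 5.0]   # focal character
y = [2.0, 1.0, 4.0, 3.0, 6.0]   # second character
w = [0.5, 0.9, 1.0, 1.2, 1.4]   # fitness


def mean(x):
    return sum(x) / len(x)


def cov(a, b):
    """Population covariance with equal weights."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


var_z, var_y, cov_zy = cov(z, z), cov(y, y), cov(z, y)
cov_wz, cov_wy = cov(w, z), cov(w, y)

# Partial regression coefficients from the normal equations for two predictors.
det = var_z * var_y - cov_zy ** 2
beta_wz = (cov_wz * var_y - cov_wy * cov_zy) / det
beta_wy = (cov_wy * var_z - cov_wz * cov_zy) / det

# Residual of fitness after removing the fitted effects of z and y.
residual = [
    wi - mean(w) - beta_wz * (zi - mean(z)) - beta_wy * (yi - mean(y))
    for wi, zi, yi in zip(w, z, y)
]

selection_price = cov_wz                              # w_bar * dz_bar, selection part
selection_path = beta_wz * var_z + beta_wy * cov_zy   # path analysis partition

print(selection_price, selection_path)
print(cov(residual, z), cov(residual, y))             # both are zero up to rounding
```

The first component, `beta_wz * var_z`, is often read as the direct effect of z on fitness, and the second, `beta_wy * cov_zy`, as the effect that arises through the association of z with y.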

Authors such as Okasha [15] consider the partitioning
of fitness into distinct causes as an alternative to
the Price equation. If one thinks of the character $z$ in
$\mathrm{Cov}(w, z)$ as a complete causal explanation for fitness,
then a partition into separate causes $y$ and $z$ does indeed
lead to a different causal understanding of fitness.
In that regard, the Price equation and path analysis lead
to different causal perspectives.

One can find articles that use the Price equation
and interpret $z$ as a lone cause of fitness [see
Okasha, 2006]. Thus, if one equates those specific
applications with the general notion of the Price
equation, then one can say that path or contextual analysis
provides a significantly different perspective from the
Price equation. To me, that seems like a socially con- 
structed notion of logic and mathematics. If someone 
has applied an abstract truth in a specific way, and one 
can find an alternative method for the same specific ap- 
plication that seems more appealing, then one can say 
that the alternative method is superior to the general 
abstract truth. 

The abstract Price equation does not compel one to 
interpret z strictly as a single cause explanation. Rather, 
in the general expression, z should always be interpreted 
as an abstract placeholder. Path (contextual) analysis 
follows as a natural extension of the Price equation, in 
which one makes specific models of fitness expressed by 
regression. It does not make sense to discuss the Price 
equation and path analysis as alternatives. 



Alternative partitions of selection and transmission 

In the standard form of the Price equation, the fitness 
term, w, appears in both components 

$$\bar{w}\Delta\bar{z} = \operatorname{Cov}(w, z) + \mathrm{E}(w\Delta z).$$

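Frank [10, 24] derived an alternative expression

$$\begin{aligned}
\Delta\bar{z} &= \sum q_i' z_i' - \sum q_i z_i \\
&= \sum q_i (w_i/\bar{w}) z_i' - \sum q_i z_i' + \sum q_i z_i' - \sum q_i z_i \\
&= \sum q_i (w_i/\bar{w} - 1) z_i' + \sum q_i (z_i' - z_i) \\
&= \operatorname{Cov}(w, z')/\bar{w} + \mathrm{E}(\Delta z). \qquad (14)
\end{aligned}$$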

This form sometimes provides an easier method to cal- 
culate effects. For example, the second term now ex- 
presses the average change in phenotype between parent 
and offspring without weighting by fitness effects. A bi- 
ased mutational process would be easy to calculate with 
this expression — one only needs to know about the mu- 
tation process to calculate the outcome. The new covari- 
ance term can be partitioned into meaningful components 
with minor assumptions [10, p. 1721], yielding

$$\operatorname{Cov}(w, z') = \operatorname{Cov}(w, z)\,\beta_{z'z},$$

where $\beta_{z'z}$ is usually interpreted as the offspring-parent
regression, which is a type of heritability. Thus, we 
may combine selection with the heritability component 
of transmission into the covariance term, with the sec- 
ond term containing only a fitness-independent measure 
of change during transmission. 
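
The equivalence of the standard form and the alternative partition in Eq. (14) can be checked numerically. The sketch below uses hypothetical frequencies, character values, and fitnesses for three types; the variable names are illustrative only.

```python
# Hypothetical three-type population (all numbers are illustrative).
q  = [0.2, 0.3, 0.5]   # ancestral frequencies q_i
z  = [1.0, 2.0, 4.0]   # ancestral character values z_i
w  = [0.5, 1.0, 1.5]   # fitnesses w_i
zp = [1.1, 2.0, 3.7]   # z'_i: average value among descendants of type i

def E(xs):
    """Expectation weighted by ancestral frequencies q."""
    return sum(qi * x for qi, x in zip(q, xs))

def cov(xs, ys):
    """Population covariance weighted by q."""
    return E([a * b for a, b in zip(xs, ys)]) - E(xs) * E(ys)

wbar = E(w)
qp = [qi * wi / wbar for qi, wi in zip(q, w)]   # descendant frequencies q'_i
dz = [b - a for a, b in zip(z, zp)]             # Delta z_i = z'_i - z_i

# Total change computed directly from the definition.
total = sum(a * b for a, b in zip(qp, zp)) - E(z)

# Standard Price form: fitness appears in both terms.
standard = (cov(w, z) + E([wi * d for wi, d in zip(w, dz)])) / wbar

# Alternative form of Eq. (14): the second term is free of fitness.
alternative = cov(w, zp) / wbar + E(dz)

print(total, standard, alternative)
```

All three printed values agree to rounding error, because both partitions are exact rearrangements of the same total change.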

Okasha [15] strongly favored the alternative partition
for the Price equation in Eq. (14), because it separates all
fitness effects in the first term from a pure transmission
interpretation of the second term. In my view, there
are costs and benefits for the standard Price equation
expression compared with Eq. (14). One gains by having
both, and using the particular form that fits a particular
problem.

For example, the term $\mathrm{E}(\Delta z)$ is useful when one has to
calculate the effects of a biased mutational process that
operates independently of fitness. Alternatively, suppose
most individuals have unbiased transmission, such that
$\Delta z = 0$, whereas very sick individuals do not reproduce
but, if they were to reproduce, would have a very biased
transmission process. Then $\mathrm{E}(\Delta z)$ differs significantly
from zero, because the sick, nonreproducing individuals
appear in this term equally with the reproducing population.
However, the actual transmission bias that occurs
in the population would be zero, $\mathrm{E}(w\Delta z) = 0$, because
all reproducing individuals have nonbiased transmission.
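
The sick-individual scenario can be made concrete with a two-type sketch. The numbers below are hypothetical: 90% of individuals are healthy with unbiased transmission, and 10% are very sick, have zero fitness, and would transmit with a strong bias if they reproduced.

```python
# Hypothetical two-type population: [healthy, very sick].
q  = [0.9, 0.1]     # frequencies
z  = [1.0, 3.0]     # character values
w  = [1.0, 0.0]     # the sick type does not reproduce
dz = [0.0, -2.0]    # transmission change each type would show if it reproduced

def E(xs):
    """Expectation weighted by ancestral frequencies q."""
    return sum(qi * x for qi, x in zip(q, xs))

def cov(xs, ys):
    return E([a * b for a, b in zip(xs, ys)]) - E(xs) * E(ys)

wbar = E(w)
zp = [zi + d for zi, d in zip(z, dz)]              # z'_i

E_dz  = E(dz)                                      # unweighted by fitness
E_wdz = E([wi * d for wi, d in zip(w, dz)])        # realized transmission bias

standard    = (cov(w, z) + E_wdz) / wbar           # standard Price form
alternative = cov(w, zp) / wbar + E_dz             # form of Eq. (14)

print(E_dz, E_wdz, standard, alternative)
```

With these numbers E(Delta z) is -0.2 although no reproducing individual has biased transmission, and E(w Delta z) is zero. Both forms give the same total change; they differ only in whether that change is assigned to the covariance term or to the transmission term.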

Both the standard Price form and the alternative in 
Eq. (14) can be useful. Different scenarios favor different
ways of expressing problems. I cannot understand why 
one would adopt an a priori position that unduly limits 
one's perspective. 



Extended set mapping expression 

The Price equation's power arises from its abstraction 
of selection in terms of mapping relations between sets 
[2, 35]. Although the Price equation is widely cited in the
literature, almost no work has developed the set mapping 
formalism beyond the description given in the initial pub- 
lications. I know of only one article. 

Kerr and Godfrey-Smith [8] noted that, in the original 
Price formulation, every descendant must derive from one 
or more ancestors. There is no natural way for novel 
entities to appear. In applications, new entities could 
arise by immigration from outside the system or, in a 
cultural interpretation, by de novo generation of an idea 
or behavior. 

Kerr and Godfrey-Smith [8] present an extended ex- 
pression to handle unconnected descendants. Their for- 
mulation depends on making explicit the connection 
number between each individual ancestor and each in- 
dividual descendant, rather than using the fitnesses of 
types. Some descendants may have zero connections. 

With an explicit description of connections, an ex- 
tended Price equation follows. The two core components 
of covariance for selection and expected change for trans- 
mission occur, plus a new factor to account for novel de- 
scendants unconnected to ancestors. 

The notation in Kerr and Godfrey-Smith [8] is complex,
so I do not repeat it here. Instead, I show a simplified
version. Suppose that a fraction p of the descendants
are unconnected to ancestors. Then we can write the
average trait value among descendants as

$$\bar{z}' = p \sum_j \alpha_j z_j^* + (1 - p) \sum_i q_i' z_i',$$

where $z_j^*$ is the phenotype for the jth member of the
descendant population that is unconnected to ancestors,
and $\alpha_j$ is the frequency of each unconnected type, with
$\sum \alpha_j = 1$. Given those definitions, we can proceed with
the usual Price equation expression

$$\begin{aligned}
\Delta\bar{z} &= \bar{z}' - \bar{z} \\
&= p \sum \alpha_j z_j^* + (1 - p) \sum q_i' z_i' - (p + 1 - p) \sum q_i z_i \\
&= (1 - p)\left(\sum q_i' z_i' - \sum q_i z_i\right) + p\left(\sum \alpha_j z_j^* - \sum q_i z_i\right).
\end{aligned}$$

Note that the term weighted by $1 - p$ leads to the standard
form of the Price equation, so we can write

$$\begin{aligned}
\Delta\bar{z} &= (1 - p)\left(\operatorname{Cov}(w, z) + \mathrm{E}(w\Delta z)\right)/\bar{w} + p\left(\sum \alpha_j z_j^* - \bar{z}\right) \\
&= (1 - p)\left(\operatorname{Cov}(w, z) + \mathrm{E}(w\Delta z)\right)/\bar{w} + p\left(\bar{z}^* - \bar{z}\right).
\end{aligned}$$

In the component weighted by p, no connections exist 
between the descendant z* and a member of the ances- 
tral population. Thus, we have no basis to relate those 
terms to fitness, transmission, or property change. Kerr 
and Godfrey-Smith [8] use an alternative notation that
associates all entities with their number of connections, 
including those with zero. The outcome is an extended 
set mapping theory for evolutionary change. The main 
concepts and the value of the approach are best explained 
by the application presented in the next section. 
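
The simplified extended expression of this section, with a fraction p of descendants unconnected to ancestors, can be checked with a short numerical sketch. All values are hypothetical, and the names `alpha` and `zstar` are illustrative stand-ins for the frequencies and phenotypes of the unconnected descendant types.

```python
# Hypothetical connected part of the population (three ancestral types).
q  = [0.2, 0.3, 0.5]   # ancestral frequencies
z  = [1.0, 2.0, 4.0]   # ancestral character values
w  = [0.5, 1.0, 1.5]   # fitnesses
zp = [1.1, 2.0, 3.7]   # average value among descendants of each type

# Hypothetical unconnected descendants (for example, immigrants).
p     = 0.25           # fraction of descendants with no ancestor
alpha = [0.4, 0.6]     # frequencies of unconnected types, summing to 1
zstar = [5.0, 7.0]     # their phenotypes

def E(xs):
    """Expectation weighted by ancestral frequencies q."""
    return sum(qi * x for qi, x in zip(q, xs))

def cov(xs, ys):
    return E([a * b for a, b in zip(xs, ys)]) - E(xs) * E(ys)

wbar = E(w)
zbar = E(z)
qp = [qi * wi / wbar for qi, wi in zip(q, w)]
dz = [b - a for a, b in zip(z, zp)]
zstar_bar = sum(a * s for a, s in zip(alpha, zstar))

# Direct calculation: descendant mean minus ancestral mean.
zbar_desc = p * zstar_bar + (1 - p) * sum(a * b for a, b in zip(qp, zp))
total = zbar_desc - zbar

# Extended Price form: standard terms weighted by (1 - p), plus novelty term.
price_part = (cov(w, z) + E([wi * d for wi, d in zip(w, dz)])) / wbar
extended = (1 - p) * price_part + p * (zstar_bar - zbar)

print(total, extended)
```

Setting `p = 0` in this sketch recovers the standard Price equation, since the novelty term vanishes.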

Gains and losses in descendants and ancestors 



Fox and Kerr [59] analyze changes in ecosystem function
by modifying the method of Kerr and Godfrey-Smith [8].
They measure ecosystem function by summing
the functional contribution of each species present
in an ecosystem. To compare ecosystems, they consider
an initial site and a second site. When comparing ecosystems,
the notion of ancestors and descendants may not
make sense. Instead, one appeals to the more general set
mapping relations of the Price equation.

Assume that there is an initial site with total function
$T = \sum z_i$, where $z_i$ is the function of the ith species. At
the initial site, there are s different species, thus we may
also express the total as $T = s\bar{z}$, where $\bar{z}$ is the average
function per species. At a second site, total function is
$T' = \sum z_j'$, with s' different species in the summation,
and $T' = s'\bar{z}'$. Let the number of species in common
between the sites be $s_c$. Thus, the initial site has $S = s - s_c$
unique species, and the second site has $S' = s' - s_c$
unique species.

Fox and Kerr [59] write the change in total ecosystem
function as

$$\begin{aligned}
\Delta T = T' - T &= s'\bar{z}' - s\bar{z} \\
&= (s' - s_c)\bar{z}' - (s - s_c)\bar{z} + s_c(\bar{z}' - \bar{z}) \\
&= S'\bar{z}' - S\bar{z} + s_c(\Delta\bar{z}).
\end{aligned}$$

The term $S'\bar{z}'$ represents the change in function caused
by the gain of an average species, in which S' is the number
of newly added species, and $\bar{z}'$ is the average function
per species. Fox and Kerr suggest that a randomly
added species would be expected to function as an average
species, and so interpret this term as the contribution
of random species gain. The term $S\bar{z}$ is interpreted similarly
as random species loss with respect to the S unique
species in the first ecosystem not present in the second
ecosystem.

Fox and Kerr partition the term $s_c(\Delta\bar{z})$ into three
components of species function: deviation from the average
for species gained at the second site, deviation from the
average for species lost from the first site, and the changes
in function for those species in common between sites.

The point here concerns the approach rather than the
theory of ecosystem function. To analyze changes between
two sets, one often benefits by an explicit decomposition
of the relations between the two sets. The original
Price equation is one sort of decomposition, based
on tracing the ways in which descendants derive from
and change with respect to ancestors. Fox and Kerr [59]
extend the decomposition of change by set mapping to
include specific components that make sense in the context
of changes in ecosystem function.

More work on the mathematics of set mapping and
decomposition would be very valuable. The Price equation
and the extensions by Kerr, Godfrey-Smith, and Fox
show the potential for thinking carefully about the abstract
components of change between sets, and how to apply
that abstract understanding to particular problems.
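
The Fox and Kerr decomposition of the change in total ecosystem function into species gain, species loss, and a term for the species in common can be checked with a small sketch. The two sites and their species functions below are hypothetical, and the dictionary representation is only one convenient way to encode the set mapping.

```python
# Hypothetical species functions at two sites (species name -> function).
site1 = {"A": 2.0, "B": 3.0, "C": 5.0}
site2 = {"B": 4.0, "C": 5.0, "D": 1.0, "E": 6.0}

s, s_prime = len(site1), len(site2)
common = site1.keys() & site2.keys()
s_c = len(common)                 # species shared by both sites
S = s - s_c                       # species unique to the initial site
S_prime = s_prime - s_c           # species unique to the second site

T = sum(site1.values())           # total function at the initial site
T_prime = sum(site2.values())     # total function at the second site
zbar = T / s                      # average function per species, site 1
zbar_prime = T_prime / s_prime    # average function per species, site 2

gain = S_prime * zbar_prime             # random species gain, S' z-bar'
loss = S * zbar                         # random species loss, S z-bar
shared = s_c * (zbar_prime - zbar)      # s_c (Delta z-bar)

delta_T = T_prime - T
print(delta_T, gain - loss + shared)
```

Here the shared term uses the difference between the site-wide averages, as in the expression for the change in total function; the finer three-way partition of that term is not reproduced in this sketch.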



Other examples 

No clear guidelines determine what constitutes an ex- 
tension to the Price equation. From a broad perspective, 
many different partitions of total change have similari- 
ties, because they separate something like selection from 
other forces that alter the similarity between populations. 

For example, the stochastic effects of sampling and 
drift create a distribution of descendant phenotypes 
around the ancestral mean. In the classical Price formu- 
lation, there is only the single realization of the actual 
descendants. A stochastic version analyzes a collection 
of possible descendant sets over some probability distri- 
bution, and a mapping from the ancestor set to each 
possible realization of the descendant set. 

In other cases, partitions will split components more 
finely or add new components not in Price's formulation. 
I do not have space to review every partition of total 
change and consider how each may be related to Price's 
formulation. I list a few examples here. 

Grafen [60] and Rice [61] developed stochastic approaches.
Grafen [27] based a long-term project on interpretations
and extensions of the Price equation. Page
and Nowak [T^] related the Price equation to various 
other evolutionary analyses, providing some minor exten- 
sions. Wolf et al. [62], Bijma and Wade [63], and many 
others developed extended partitions by splitting causes 
with regression or similar methods such as path analysis. 
Various forms of the Price equation have been applied in
economic theory [T3"].

DIFFICULTIES WITH VARIOUS CRITIQUES OF 
THE PRICE EQUATION 

A reliable way to make people believe in falsehoods
is frequent repetition, because familiarity
is not easily distinguished from truth [64,
p. 62].

One must distinguish the full, exact Price equation from 
various derived forms used in applications. The derived 
forms always make additional assumptions or express 
approximate relations [10]. Each assumption increases
specificity and reduces generality in relation to particu- 
lar goals. 

Critiques of the Price equation rarely distinguish the 
costs and benefits of particular assumptions in relation 
to particular goals. I use van Veelen's recent series of 
papers as a proxy for those critiques. That series repeats 
some of the common misunderstandings and adds some 
new ones. Nowak recently repeated van Veelen's critique 
as the basis for his commentary on the Price equation.

Dynamic sufficiency 

The Price equation describes the change in some measurement,
expressed as $\Delta\bar{z}$. Change is calculated with
respect to particular mapping relations between ancestor
and descendant populations. We can think of the
mappings and the beginning value of $\bar{z}$ as the initial
conditions or inputs, and $\Delta\bar{z}$ as the output.

The output, $\bar{z}' = \bar{z} + \Delta\bar{z}$, does not provide enough
information to iterate the calculation of change in order to
get another value of $\Delta\bar{z}$ starting with $\bar{z}'$. We would also
need the mapping relations between the new descendant 
population and its subsequent descendants. That infor- 
mation is not part of the initial input. Thus, we cannot 
study the dynamics of change over time without addi- 
tional information. 
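
The role of additional information can be illustrated with a short sketch. The function `price_step` below (a hypothetical name) performs one application of the Price equation and needs the fitnesses and descendant values as inputs. Iteration becomes possible only after adding a model that supplies those inputs in every generation; here I assume, purely for illustration, that fitness is a fixed property of each type and that transmission is perfect.

```python
def price_step(q, z, w, z_desc):
    """One ancestor-to-descendant step.

    q: ancestral frequencies, z: ancestral values, w: fitnesses,
    z_desc: average value among descendants of each type.
    Returns (change in mean, descendant frequencies).
    """
    wbar = sum(qi * wi for qi, wi in zip(q, w))
    q_next = [qi * wi / wbar for qi, wi in zip(q, w)]
    zbar = sum(qi * zi for qi, zi in zip(q, z))
    zbar_next = sum(qn * zd for qn, zd in zip(q_next, z_desc))
    return zbar_next - zbar, q_next

# Hypothetical two-type population.
q = [0.5, 0.5]
z = [0.0, 1.0]

# Added model assumptions (not part of the Price equation itself):
# fitness depends only on type, and descendants copy their parent's z.
w = [1.0, 2.0]

deltas = []
for _ in range(3):
    dzbar, q = price_step(q, z, w, z_desc=z)
    deltas.append(dzbar)

print(deltas, q)
```

Without the two added assumptions, the second call to `price_step` would have no values of `w` and `z_desc` to use, which is the sense in which the single-step output alone does not determine the dynamics.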

This limitation with regard to repeated iteration is 
called a lack of dynamic sufficiency [69]. Confusion about
the nature of dynamic sufficiency in relation to the Price 
equation has been common in the literature. In Frank [7, 
pp. 378-379], I wrote 

It is not true, however, that dynamic suffi- 
ciency is a property that can be ascribed to 
the Price Equation — this equation is simply 
a mathematical tautology for the relationship 
among certain quantities of populations. In- 
stead, dynamic sufficiency is a property of 
the assumptions and information provided 
in a particular problem, or added by addi- 
tional assumptions contained within numer- 
ical techniques such as diffusion analysis or 



applied quantitative genetics. . . . What prob- 
lems can the Price equation solve that cannot 
be solved by other methods? The answer is, 
of course, none, because the Price Equation 
is derived from, and is no more than, a set of 
notational conventions. It is a mathematical 
tautology. 

I showed how the Price equation helps to define the neces- 
sary conditions for dynamic sufficiency. Once again, the 
Price equation proves valuable for clarifying the abstract 
structure of evolutionary analysis. 

Compare my statement to van Veelen et al. [6]

Dynamic insufficiency is regularly mentioned 
as a drawback of the Price equation (see for 
example Frank, 1995; Rice, 2004). We think 
that this is not an entirely accurate descrip- 
tion of the problem. We would like to ar- 
gue that the perception of dynamic insuffi- 
ciency is a symptom of the fundamental prob- 
lem with the Price equation, and not just a 
drawback of an otherwise fine way to describe 
evolution. To begin with, it is important to 
realize that the Price equation itself, by its 
very nature, cannot be dynamically sufficient 
or insufficient. The Price equation is just an 
identity. If we are given a list of numbers that 
represent a transition from one generation to 
the next, then we can fill in those numbers in 
both the right and the left hand side of the 
Price equation. The fact that it is an identity 
guarantees that the numbers that appear on 
both sides of the equality sign are the same. 
There is nothing dynamically sufficient or in- 
sufficient about that (this point is also made 
by Gardner et al., 2007, p. 209). A model, on 
the other hand, can be dynamically sufficient 
or insufficient. 

This quote from van Veelen et al. [6] demonstrates an
interesting approach to scholarship. They first cite Frank 
as stating that dynamic insufficiency is a drawback of the 
Price equation. They then disagree with that point of 
view, and present as their own interpretation an argu- 
ment that is nearly identical in concept and phrasing to 
my own statement in the very paper that they cited as 
the foundation for their disagreement. 

In this case, I think it is important to clarify the con- 
cepts and history, because influential and widely cited 
authors, such as Nowak, are using van Veelen's articles as 
the basis for their own critiques of the Price equation and 
approaches to fundamental issues of evolutionary analy- 
sis. 

With regard to dynamics, any analysis achieves the
same dynamic status given the same underlying assumptions.
The Price equation, when used with the same
underlying assumptions as population genetics, has the
same attributes of dynamic sufficiency as population genetics.

Interpretation of covariance

van Veelen et al. [6] claim that 

Maybe the most unfortunate thing about the 
Price equation is that the term on the right 
hand side is denoted as a covariance, even 
though it is not. The equation thereby turns 
into something that can easily set us off in 
the wrong direction, because it now resembles 
equations as they feature in other sciences, 
where probabilistic models are used that do 
use actual covariances. 

One can see the covariance expression in the standard 
form of the Price equation given in Eq. (2). In the Price
equation, the covariance is measured with respect to the 
total population, in other words, it expresses the associa- 
tion over all members of the population. In many statis- 
tical applications, one only has data on a subset of the 
full population, that subset forming a sample. It is im- 
portant to distinguish between population measures and 
sample measures, because they refer to different things. 

Price [U p. 485] made clear that his equation is about 
total change in entire populations, so the covariance is 
interpreted as a population measure 

[W]e will be concerned with population func- 
tions and make no use of sample functions, 
hence we will not observe notational conven- 
tions for distinguishing population and sam- 
ple variables and functions. 
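
The distinction between population and sample covariance can be seen in a small numerical sketch. The data are hypothetical, individuals are equally weighted, and descendants are assumed to copy their parent's value exactly, so that the total change equals Cov(w, z) divided by mean fitness.

```python
# Hypothetical complete population of four individuals.
z = [1.0, 2.0, 4.0, 7.0]   # character values
w = [1.0, 1.0, 2.0, 4.0]   # fitnesses (offspring numbers)
n = len(z)

def mean(xs):
    return sum(xs) / len(xs)

def pop_cov(xs, ys):
    """Population covariance: average product of deviations (divide by n)."""
    mx, my = mean(xs), mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / len(xs)

def sample_cov(xs, ys):
    """Sample covariance: unbiased estimator (divide by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (len(xs) - 1)

wbar = mean(w)

# Direct calculation of the change in the mean under perfect transmission:
# each individual contributes to descendants in proportion to w_i.
total = sum(wi * zi for wi, zi in zip(w, z)) / (n * wbar) - mean(z)

print(total, pop_cov(w, z) / wbar, sample_cov(w, z) / wbar)
```

Only the population covariance reproduces the total change exactly. The sample covariance differs by the factor n/(n - 1), which is appropriate for estimation from a sample but not for an identity over the whole population.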

In addition to population and sample measures, covariance
also arises in mathematical models of process.
Suppose, for example, that I develop a model in which 
random processes influence fitness and random processes 
influence phenotype. If the random fluctuations in fitness 
and the random fluctuations in phenotype are associated, 
the random variables of fitness and phenotype would covary.
All of these different interpretations of covariance
are legitimate; they simply reflect different situations.



DISCUSSION 

In Frank [7], I wrote: "What problems can the Price
equation solve that cannot be solved by other methods? 
The answer is, of course, none, because the Price Equa- 
tion is derived from, and is no more than, a set of nota- 
tional conventions. It is a mathematical tautology." 

Nowak and Highfield [5] and van Veelen et al. [5] em- 
phasize the same point in their critique of the Price equa- 
tion, although they present the argument as a novel in- 
sight without attribution. Given that the Price equa- 
tion is a set of notational conventions, it cannot uniquely 
specify any predictions or insights. A particular set of as- 
sumptions leads to the same predictions, no matter what 
notational conventions one uses. The Price equation is 



a tool that sometimes helps in analysis or in seeing gen- 
eral connections between apparently disparate ideas. For 
many problems, the Price equation provides no value, be- 
cause it is the wrong tool for the job. 

If the Price equation is just an equivalence, or tautol- 
ogy, then why am I enthusiastic about it? Mathematics 
is, in its essence, about equivalences, as expressed beau- 
tifully in the epigraph from Mazur. Not all equivalences 
are interesting or useful, but some are, just as not all 
mathematical expressions are interesting or useful, but 
some are. 

That leads us to the question of how we might know
whether the Price equation is truly useful or a mere identity.
It is not always easy to say exactly what makes an
abstract mathematical equivalence interesting or useful. 
However, given the controversy over the Price equation, 
we should try. Because there is no single answer, or even 
a truly unique and unambiguous question, the problem 
remains open. I list a few potential factors. 

"[A] good notation has a subtlety and suggestiveness 
which at times make it seem almost like a live teacher" 
[70, pp. 17-18]. Much of creativity and understanding
comes from seeing previously hidden associations. The 
tools and forms of expression that we use play a strong 
role in suggesting connections and are inseparable from 
cognition [64]. Equivalences and alternative notations
are important. 

The various forms of the covariance component from 
the Price equation given in Eq. (9) show the equivalence
of the statistical, geometrical and informational expres- 
sions for natural selection. The recursive form of the full 
Price equation provides the foundation for all modern 
studies of group selection and multilevel analysis. The 
Price equation helped in discovering those various con- 
nections, although there are many other ways in which 
to derive the same relations. 

Hardy [71] also emphasized the importance of seeing
new connections between apparently disparate ideas: 

We may say, roughly, that a mathematical 
idea is 'significant' if it can be connected, in 
a natural and illuminating way, with a large 
complex of other mathematical ideas. Thus 
a serious mathematical theorem, a theorem 
which connects significant ideas, is likely to 
lead to important advances in mathematics 
itself and even in other sciences. 

What sort of connections? One type concerns the in- 
variances discovered or illuminated by the Price equation. 
I discussed some of those invariances in an earlier section, 
particularly the information theory interpretation of nat- 
ural selection through the measure of Fisher information 
[36]. Fisher's fundamental theorem of natural selection
is a similar sort of invariance [75]. Kin selection theory 
derives much of its power by identifying an invariant in- 
formational quantity sufficient to unify a wide variety of 
seemingly disparate processes [21, Chapter 6]. The interpretation
of kin selection as an informational invariance
has not been fully developed and remains an open problem.

Invariances provide the foundation of scientific under- 
standing: "It is only slightly overstating the case to say 
that physics is the study of symmetry" [73]. Invariance 
and symmetry mean the same thing [38]. Feynman [37] 
emphasized that invariance is The Character of Physi- 
cal Law. The commonly observed patterns of probability 
can be unified by the study of invariance and its association
to measurement [74, 75]. There has been little effort
in biology to pursue similar understanding of invariance
and measurement [76, 77].

Price argued for the great value of abstraction, in the 
sense of the epigraph from Mazur. In Price [35]:

[D]espite the pervading importance of selec- 
tion in science and life, there has been no 
abstraction and generalization from genetical 
selection to obtain a general selection theory 
and general selection mathematics. Instead, 
particular selection problems are treated in 
ways appropriate to particular fields of sci- 
ence. Thus one might say that 'selection the- 
ory' is a theory waiting to be born — much 
as communication theory was 50 years ago. 
Probably the main lack that has been holding 
back any development of a general selection 
theory is lack of a clear concept of the general 
nature or meaning of 'selection'. 

This article has been about the Price equation in re- 
lation to its abstract properties and its connections to 
various topics, such as information or fundamental in- 



variances. Some readers may feel that those aspects of 
abstraction and invariance are nice, but far from daily 
work in biology. What of the many applications of the 
Price equation to kin or group selection? Do those ap- 
plications hold up? How much value has been added? 

Because the Price equation is a tool, one can always 
arrive at the same result by other methods. How well 
the Price equation works depends partly on the goal and 
partly on the fit of the tool to the problem. There is 
inevitably a strongly subjective aspect to deciding about 
how well a tool works. Nonetheless, hammers truly are 
good for nails and bad for screws. For valuing tools, there 
is a certain component that should be open to agreement. 
For example, the Robertson [29] form of the Price equation
is widely regarded as the foundational method for
analyzing models of evolutionary quantitative genetics. 
However, not all problems in quantitative genetics are 
best studied with the Robertson-Price equation. And 
not all problems in social evolution benefit from a Price 
equation approach. 

The Price equation or descendant methods have led
to many useful models for kin selection [53]. The most
powerful follow a path analysis decomposition of causes
or use a simple maximization method to analyze easily
what would otherwise have been difficult. I will return
to those applications in subsequent articles.